> The existence of packages (multiple name spaces for symbols)
Sigh. It is depressing to see this kind of mistake in a book targeted towards beginners. There is already enough confusion around packages.
Packages are not "multiple name spaces for symbols". A namespace is a set of bindings, and so a namespace for symbols would be a set of bindings for symbols, and that is the definition of an environment [1], not a package.
A package maps strings onto symbols, so technically a package could be considered a set of bindings for strings, though no one actually thinks of them that way and they are never ever referred to that way. Packages are data structures that map strings onto symbols. That's all.
One can only hope that this is an isolated mistake and not indicative of what goes on in the rest of the book.
Could you explain the significance of this blunder? Or like where this distinction would be important? I see lots of confusion between "packages" and ASDF systems, but none between packages and environments. It seems to me, I write a function in a package and whether I am in the package or using it, I refer to that function by the symbol it is given in the defun, even if really in the latter case it's some sugar around a string implied by the original defun. Where will it matter to think about it the right way? When I need to shadow things maybe?
Definitely take your authority on this in general, just genuinely curious.
That's a great question, and actually not so easy to answer. To understand it you have to put yourself in the mindset of a beginner, and here there are two possibilities: either you are a beginner, in which case putting yourself in a beginner mindset is trivial because you're already there, or you are not, in which case putting yourself in a beginner mindset is very, very hard because you have to actively forget (or ignore) things that you know, and that is not easy to do. I don't know which you are, but either way, I'm going to ask you to put yourself into a beginner mindset.
So with your beginner mindset on, imagine reading this sentence for the first time:
"The existence of packages (multiple name spaces for symbols) in Common Lisp is very important for allowing several people to cooperate in producing a large system."
Now, as a beginner, you don't know what "package" or "name space" or "symbol" means, though you might have some preconceived notions about these words. But embedded in this sentence is the "fact" that packages are name spaces for symbols (whatever that might actually mean), and so you tuck this little factoid away in your mind so you can try to make sense of it later because the author has taken pains to point out that whatever it means, it's "very important".
And then you read the rest of the book, where you find that the term "name space" is never mentioned again.
At this point you might find yourself scratching your head a bit. On the one hand the author takes pains to call out packages as "very important" and tells you what they are, but on the other he never bothers to explain what the words that define them actually mean. So you, being intellectually curious, go out onto the Internet to try to find out what a "name space" is. A reasonable place to start might be the definition in the Common Lisp Hyperspec:
> namespace n. 1. bindings whose denotations are restricted to a particular kind. ``The bindings of names to tags is the tag namespace.'' 2. any mapping whose domain is a set of names. ``A package defines a namespace.''
This seems promising, though it seems a bit odd that here a package defines a namespace, rather than a package is a namespace. Maybe "defines" and "is" are actually synonyms? But the bigger problem is that this definition is chock-full of new words which you as a beginner don't know the meaning of, most notably "binding". So you go to find out what that means, and your first step is to go back to the book, which turns out to be no help because despite the fact that it uses the word "binding" it never actually defines it, and also when the word is first introduced it is used as a verb, not a noun. So back to the Hyperspec:
> binding n. an association between a name and that which the name denotes. ``A lexical binding is a lexical association between a name and its value.'' When the term binding is qualified by the name of a namespace, such as ``variable'' or ``function,'' it restricts the binding to the indicated namespace, as in: ``let establishes variable bindings.'' or ``let establishes bindings of variables.''
Whoa! That's a lot of new words, starting with "name", which is hyperlinked so it has a definition of its own:
> name n., v.t. 1. n. an identifier by which an object, a binding, or an exit point is referred to by association using a binding. 2. v.t. to give a name to. 3. n. (of an object having a name component) the object which is that component. ``The string which is a symbol's name is returned by symbol-name.'' 4. n. (of a pathname) a. the name component, returned by pathname-name. b. the entire namestring, returned by namestring. 5. n. (of a character) a string that names the character and that has length greater than one. (All non-graphic characters are required to have names unless they have some implementation-defined attribute which is not null. Whether or not other characters have names is implementation-dependent.)
Double whoa! What is an "identifier"?
> identifier n. 1. a symbol used to identify or to distinguish names. 2. a string used the same way.
OK, at least now this is a short definition, with only two new words, SYMBOL and STRING. So what is a symbol?
> symbol n. an object of type symbol.
Well, that's not very helpful. And at this point you might be forgiven if you decide that this whole Lisp thing is not really worth the bother and you really ought to go learn Rust instead because that seems to be what the cool kids are using anyway.
The Right Way to explain this to a beginner IMHO is to start with strings because everyone has an intuition about what those are which is close enough: a STRING is a sequence of characters (which are a complicated topic in their own right, but a naive view of what characters are is good enough to start with). A SYMBOL is a thing with a NAME, which is a string, and which cannot be changed. A symbol is created with a name, and it retains that same name forever. A PACKAGE is just a map, a function, from strings onto symbols such that the symbol mapped from string S has the name S. This is significant because it ensures that there is one and only one symbol with the name S in a given package at a given time. Packages are COLLECTIONS OF SYMBOLS WITH UNIQUE NAMES stored in a way that allows you to efficiently find the unique symbol with any given name in that collection. That's it (or at least close enough for a beginner). The reason this matters is that it allows different people to write code without stomping on each other. Alice can put her symbols in one package, and Bob can put his symbols in a different package, and so when Alice types "foo" into her code she can count on that referring to the One Symbol Named "foo" in Alice's package, and likewise when Bob types "foo" into his code he can count on that referring to the One Symbol Named "foo" in Bob's package. But these are nonetheless two different symbols (with the same name) and so Alice's code doesn't stomp on Bob's code.
That's it. ~250 words with no unfamiliar terminology for a beginner to have to scratch their head over (assuming they already know what a "map" or a "function" is).
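For what it's worth, the Alice-and-Bob picture can be sketched at a REPL. This is only an illustration: the package names ALICE and BOB and the variables *alice-foo* and *bob-foo* are made up for the example.

```lisp
;; Hypothetical packages for Alice and Bob; (:use) keeps them empty.
(defpackage "ALICE" (:use))
(defpackage "BOB" (:use))

;; INTERN looks up a string in a package, creating the symbol if needed.
(defparameter *alice-foo* (intern "FOO" "ALICE"))
(defparameter *bob-foo*   (intern "FOO" "BOB"))

(symbol-name *alice-foo*)                     ; the string "FOO"
(symbol-name *bob-foo*)                       ; also the string "FOO"
(eq *alice-foo* *bob-foo*)                    ; NIL: two symbols, one name
(eq *alice-foo* (find-symbol "FOO" "ALICE"))  ; T: one symbol per name per package
```

Note that everything going into the package (via intern and find-symbol) is a string, and everything coming out is a symbol.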
This makes sense, but I guess I still don't see how this might get the beginner into too much trouble. Pedagogical methods and statements will never stay quite true to the letter of the spec. When we teach, we say one thing "is like" another, or you "might understand this by comparing it to foo," knowing well that the concept isn't actually foo. This is never meant to confuse the student, but to help them get in the right position to actually understand something.
But still, I get your point. Certain diligent students might read that statement and, rationally, read a lot into the terms "namespace" and "binding," and then get confused when they look up those terms in the hyperspec. It does come across as careless in this way. But I just don't think, in itself, this would lead to a confusion about what packages are good for, or why they are used. Whether a beginner wrongly considers a package a namespace, or something like a namespace, wouldn't affect the way they would decide to actually use a package in their work. (Not to disregard how an experienced, real-lisp-understander might use their precise knowledge of the way packages work to make even better/cleaner/organized code.)
But either way, thank you for the thoughts, and please take my thoughts with a grain of salt.
The problem is not so much with the use of the word "namespace" per se, but with saying that packages are name spaces for symbols. That's just wrong, and is in direct conflict with the definition of things that are actually talked about as name spaces for symbols, most notably, variable and function bindings. The difference is that the domains of these "name spaces" are disjoint. The domain of packages is the set of strings, but the domain of (say) lexical variable bindings is the set of symbols. So, for example, lexical environments and dynamic environments are name spaces for symbols because their domains are symbols. If you insist on thinking about packages as name spaces (which no one actually does, which is another reason the sentence is misleading) then they are name spaces for strings, not for symbols.
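A rough sketch of the difference in domains, using only standard operators. The variable name *answer* is made up for the example.

```lisp
;; A package is keyed by strings: FIND-SYMBOL takes a string and
;; returns a symbol (plus a status as a second value).
(find-symbol "CAR" "COMMON-LISP")   ; => CAR, :EXTERNAL

;; A lexical environment is keyed by symbols: here the symbol X is bound to 1.
(let ((x 1))
  x)                                ; => 1

;; The global (dynamic) environment is keyed by symbols as well.
(defvar *answer* 42)                ; *ANSWER* is a hypothetical name
(symbol-value '*answer*)            ; => 42
```

In the first form the key is the string "CAR"; in the other two the keys are the symbols X and *ANSWER*.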
I get the impression that the original quote from the book is using "name space" as it's used in other languages outside of CL, except that the book is from 1992 so not sure if that usage was widespread yet - maybe in C++? But it's almost like they wanted to say "a package is analogous to a name space as it's used in [algol-derivative]." Definite potential for confusion there, since "namespace" has a very specific meaning in CL and in general helps explain the difference between Lisp-1 and Lisp-2 (Lisp-n) varieties.
The quote is in the Preface, which is outside of the book itself. The same paragraph acknowledges that "packages can be very confusing for Lispers who have not learned about them in an organized way" and that the author has "seen experienced, Ph.D.-level Lispers hack away, adding qualifications to symbol names in their code, with no understanding of the organized structure of the package system." Furthermore, he adds that "[t]he reasons that packages are even more confusing in Lisp than in other, compiler-oriented languages, such as Ada, is that in Lisp one may introduce symbols on-line, and one typically stays in one Lisp environment for hours, losing track of what symbols have been introduced. A symbol naming conflict may be introduced in the course of debugging that will not occur when the fully developed files are loaded into a fresh environment in the proper order."
After that, nobody in their right mind should be hanging on to any preconceived notion that they might have spun out of the earlier phrase "multiple name spaces of symbols", and should be reading the actual book.
> nobody in their right mind should be hanging on to any preconceived notion that they might have spun out of the earlier phrase "multiple name spaces of symbols"
Except that "namespace" is a term of art which is actually defined in the Common Lisp standard, and so it is not unreasonable to suppose that this is the intended meaning in a book on Common Lisp.
> namespace n. 1. bindings whose denotations are restricted to a particular kind. ``The bindings of names to tags is the tag namespace.'' 2. any mapping whose domain is a set of names. ``A package defines a namespace.''
So if "the bindings of names to tags is the tag namespace", that means that a set of bindings of names to symbols is a symbol namespace!
According to the Common Lisp Glossary, the values of the dictionary define what the namespace is of. Of course the keys are always names; that's what makes it a namespace.
Also, look: "a package defines a namespace" (and it must be one of symbols).
You're not catching crafty old Shapiro red-handed in anything here.
> So if "the bindings of names to tags is the tag namespace", that means that a set of bindings of names to symbols is a symbol namespace!
No, that does not follow. Even if one were to admit the parallel construct here, the result would be "the symbol namespace" which is clearly nonsense.
Neither "tag namespace" nor "symbol namespace" is a term of art, so here we are in the domain of natural language, and natural language is irregular and ambiguous. In natural language usage, a "namespace for symbols" is one where symbols are the domain, not the range. In particular, the most common usage is to distinguish between Lisp-1 and Lisp-2, where the former has a single namespace for symbols and the latter has at least two, one for values and one for functions.
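To make the Lisp-2 point concrete, here is a small sketch in which one symbol is used as a name in two namespaces at once. The name twice is made up for the example.

```lisp
;; TWICE is a hypothetical name; it gets a function binding here...
(defun twice (x) (* 2 x))

;; ...and a variable binding here. The two do not collide.
(let ((twice 21))
  (twice twice))     ; function TWICE applied to the value of variable TWICE => 42
```

In a Lisp-1 the inner binding would shadow the function and the call would fail, because there the symbol has only one namespace to live in.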
A binding being an association between a name and a value is a defect in Common Lisp.
The correct view (in a language with mutable variables) is that it's an association between a name and an abstract location where a value is stored. When we assign to a variable, the binding doesn't change; any closure which has captured that binding sees the new value.
On entry into a lexical scope, fresh bindings are allocated only once, no matter how many times a new value is assigned to any of them in that same scope.
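A minimal sketch of the binding-as-location view, assuming nothing beyond standard closures. The function name make-cell is made up for the example.

```lisp
(defun make-cell (initial)
  "Return two closures that share the single binding of VALUE."
  (let ((value initial))
    (values (lambda () value)                    ; reads the binding
            (lambda (new) (setq value new)))))   ; assigns to the same binding

(multiple-value-bind (getter setter) (make-cell 1)
  (funcall setter 2)     ; assignment: same binding, new contents
  (funcall getter))      ; => 2
```

The getter sees the new value because both closures captured the one binding created on entry to the let, not a copy of its value.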
Schemers get this right. In R7RS:
> An identifier that names a location is called a variable and is said to be bound to that location.
The problem is not just in the Glossary. defvar is documented as leaving the variable unbound if it is previously unbound (in the case when no initial-value is specified).
Yet, defvar is described as establishing the name as a dynamic variable. But a variable is defined as a binding. Thus if X is unbound and (defvar X) leaves it unbound, then it is not establishing a variable because that would require a binding. Oops!
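The situation can be reproduced in a few lines. This is only a sketch; the names *unset* and read-unset are made up, and the first result assumes the variable had no value beforehand.

```lisp
(defvar *unset*)                 ; hypothetical name, no initial value given

(boundp '*unset*)                ; => NIL, assuming it was not bound before

(defun read-unset () *unset*)

(let ((*unset* 5))               ; a dynamic binding, since DEFVAR proclaimed it special
  (read-unset))                  ; => 5
```

So the symbol is unbound and yet clearly already established as a dynamic variable: the let binding is visible inside read-unset.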
> A binding being an association between a name and a value is a defect in Common Lisp.
Well, it would be if that is how CL actually defined the word "binding" but it's not. The CL glossary defines "binding" simply as "an association between a name and that which the name denotes". It does not specify that "that which the name denotes" must be a value, and in fact there are name spaces (collections of bindings) in CL where the things denoted by names are not values. The TAG namespace, for example.
The CL authors did make a few mistakes, but the definition of "binding" is not one of them.
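For anyone who hasn't run into the tag namespace, a small sketch. The names count-to-three and again are made up for the example.

```lisp
(defun count-to-three ()
  (let ((n 0))
    (tagbody
     again                          ; AGAIN is a tag: a name that denotes a
       (incf n)                     ; place in the code, not a value
       (when (< n 3) (go again)))
    n))

(count-to-three)                    ; => 3
```

Here again is bound in the tag namespace only; it has no variable or function binding, and what it denotes is not a value you can get hold of.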
Well, yeah, but the chapter on strings is chapter 5. Shapiro doesn't start with strings, he starts with numbers, which I think is a catastrophic mistake. When I say start with strings I mean start with strings. Actually, start with characters, i.e. start by pointing out that the fundamental units of computation when you interact with a computer using a keyboard are things like these:
a b c d e f g h i j k l m n o p q r s t u v w x y z
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
0 1 2 3 4 5 6 7 8 9
~ ! @ # $ % ^ & * ( ) _ - + = { } [ ] : ; " ' < > , . / ?
Then talk about how stringing these things together in sequences can denote different things, but that these denotations are just conventions. For example, by convention we denote strings using double quotes:
"This is a string"
Note that there are two ways you can look at the above. You can see it as a sequence of 18 characters that starts and ends with quotes, or you can see it as a sequence of 16 characters that starts with T and ends with g. This ambiguity leads to a whole host of problems, not least of which is that if you want to write a string that includes a double-quote mark you now have to somehow indicate that the embedded double-quote does not denote the end of the string but is intended to be a constituent of the string, and the fact that the same character is used to denote both the start and end of strings was actually a catastrophic design error but we're stuck with it now because of the weight of history. (The Right Way to denote strings is with balanced quotes «like this» but that ship sailed a long time ago.)
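Both readings can be checked directly in Lisp. A quick sketch, assuming default printer settings:

```lisp
;; The string itself: 16 characters, from T to g.
(length "This is a string")                     ; => 16

;; Its printed representation, including the two quote marks.
(length (prin1-to-string "This is a string"))   ; => 18

;; An embedded double-quote has to be escaped with a backslash,
;; but the backslashes are not part of the string.
(length "say \"hi\"")                           ; => 8
```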
And then, once the student understands strings, you can start talking about how some strings, like "123", stand for numbers, and how this is also just a convention, because strings like "123,000.00" look like numbers to any educated human but don't stand for numbers in any programming language except Microsoft Excel, again because of history, yada yada yada.
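In Lisp, for instance, the standard function parse-integer makes the convention visible. A small sketch:

```lisp
;; "123" follows the convention, so it stands for a number.
(parse-integer "123")                           ; => 123

;; "123,000.00" does not: parsing stops at the comma.
(parse-integer "123,000.00" :junk-allowed t)    ; => 123

;; Without :JUNK-ALLOWED the same string is rejected outright.
(handler-case (parse-integer "123,000.00")
  (parse-error () :not-a-number))               ; => :NOT-A-NUMBER
```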
The point is, numbers are really really complicated, even more complicated than symbols, and they are definitely not the right place to start teaching any of this notwithstanding that this is where everyone starts.
Shapiro does a very good job of emphasizing the difference between an object and its multiple printed-representations (S-expression). He promises to do that in the Preface, and delivers.
Computer Science once used to be the same thing as Numerical Analysis; that has left a deep imprint on the education. Here is how we use Lisp: (+ 2 2) evaluates to 4.
> Shapiro does a very good job of emphasizing the difference between an object and its multiple printed-representations (S-expression).
I think that is debatable. When he introduces S-expressions in chapter 3 it is in the context of a chapter on lists, not S-expressions. (In fact, he doesn't have a chapter on S-expressions!) And he doesn't actually define S-expression, he only defines "list S-expression" and leaves it up to the reader to infer that numbers are S-expressions -- or is it the printed representation of a number that is an S-expression? Shapiro never actually says. So is 123 a number? Or is it an S-expression denoting a number? Are these the same thing? Again, Shapiro never actually says. AFAICT, at no point in the book does he ever make it explicit that an S-expression is a string, and in particular, a string which is a serialization of a data structure.
It makes sense to you because you already know how it works. You need to read it with your beginner-mindset hat on to see the problems.
But I had read that with an actual beginner's* mindset. Since then, not once have I had the thought that Shapiro misled about this or that.
In the Preface there is a paragraph, "Package Systems, S-expressions and Forms", where Shapiro explains what those mean. It's clear that he's not using "S-expression" just to refer to compound syntax. The paragraph concludes:
> In this book, I distinguish the S-expression from the form—the printed representation from the object—in Chapter 1 and continue making the distinction consistently and explicitly through the entire book.
There is only one small problem there: Common Lisp uses "form" for an expression in an evaluated context. What Shapiro wants there is "distinguish the S-expression from the (internal) expression".
Some of that paragraph also belongs in the book proper rather than in the Preface.
Chapter 3 does not leave it to the reader to infer that number tokens are S-expressions. It says so explicitly: "According to this definition (1 2 3.3 4) is a list S-expression, since 1, 2, 3.3 and 4 are S-expressions (denoting numbers)". That's just reiteration; prior material in the book hammered the point that the printed representation of any object is an S-expression.
---
* Well, a Lisp beginner's mindset. Not a programming beginner's mindset. If you already know, for example, that compilers scan textual numeric tokens and turn them into binary numbers, that colors the interpretation.
> prior material in the book hammered the point that the printed representation of any object is an S-expression
Yes, he does say that. The problem is that this is wrong. There are many printed forms of objects that are not S-expressions. In fact, these are so common that CL has some fairly extensive infrastructure for dealing with these cases.
S-expressions have nothing to do with printing (except insofar as Lisp makes an effort to maintain read-print consistency in some circumstances), they have to do with reading. They are operational at the beginning of the read-eval-print loop, not at the end.
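A sketch of a printed form that is not an S-expression, and of a bit of the infrastructure for dealing with it. The class widget and the variable *printed* are made up for the example; print-unreadable-object, *print-readably*, reader-error and print-not-readable are all standard.

```lisp
(defclass widget () ())          ; WIDGET is a hypothetical class

(defmethod print-object ((w widget) stream)
  (print-unreadable-object (w stream :type t :identity t)))

(defparameter *printed* (prin1-to-string (make-instance 'widget)))

(subseq *printed* 0 2)           ; => "#<"; the rest is implementation-specific

;; The reader refuses to read it back.
(handler-case (read-from-string *printed*)
  (reader-error () :cannot-read-back))          ; => :CANNOT-READ-BACK

;; And if you insist on readable output, the printer refuses to print it.
(handler-case (let ((*print-readably* t))
                (prin1-to-string (make-instance 'widget)))
  (print-not-readable () :refused-to-print))    ; => :REFUSED-TO-PRINT
```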
Yes, obviously, most of those chapters are concerned with what we type into the text file or REPL, which is often something that was not ever printed.
There are many examples of #< notation in the book, but, as far as I can see, no remarks are made about what that means, or even that the contents of #<...> are implementation-specific and may appear differently. I don't see any discussions of the concept of print-read consistency, and so on: that objects can sometimes be printed in a way that either cannot be read at all, or worse, that produces a different object, like the #: notation.
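The #: case is easy to show in a few lines. A sketch, assuming default printer settings; the function name round-trip-uninterned is made up for the example.

```lisp
(defun round-trip-uninterned ()
  "Print an uninterned symbol, read the text back, and compare."
  (let* ((original (make-symbol "TEMP"))          ; belongs to no package
         (text     (prin1-to-string original))    ; printed with the #: prefix
         (copy     (read-from-string text)))
    (list text
          (string= (symbol-name original) (symbol-name copy))
          (eq original copy))))

(round-trip-uninterned)   ; => ("#:TEMP" T NIL)
```

The text reads back without complaint, but what comes back is a fresh symbol with the same name, not the object that was printed.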
It would help the book to talk early about print-read consistency. What it is, when do we have it, when do we not have it, in what situations can we provide it for ourselves when we don't have it, etc.
Compiling Lisp isn't covered in the book; there is only a cursory mention of compile-file. The omission is a lost opportunity to discuss Lisp's "interactive approach" to compilation. In compilation there is the issue of literals: what kinds of objects are externalizable. That relates to printing because externalization is a kind of printing. Compile-file has to print the literal objects into some kind of bits in the file, which then recover a similar object.
The word "image" doesn't appear in the book; it doesn't look as if image saving is mentioned anywhere.
Shapiro does a good job of covering packages, probably better than most, if not all, other books.
Because I started in Lisp with that book (back in late 1999 or 2000? I can't remember), I was never confused about how packages and symbols work.
What the book does right is explain, early, that symbols are objects that have a name which is a character string, and which is different from the token syntax so that Foo and FOO and |FOO| are the same symbol:
Quote from the Symbol chapter:
> Returning to our discussion of symbols, every symbol has a print name, or simply a name, which is a string of the characters used to print the symbol. The name of the symbol frank is the string "FRANK". You can see the name of a symbol by using the function symbol-name
After that it goes into how we can get any characters we want into a symbol name, and so on.
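With a standard readtable and everything read in the same package, that looks roughly like this at a REPL:

```lisp
(symbol-name 'frank)           ; => "FRANK"

(eq 'Foo 'FOO)                 ; => T
(eq 'foo '|FOO|)               ; => T

;; Vertical bars suppress case conversion, so this is a different symbol.
(eq 'foo '|foo|)               ; => NIL

;; They also let otherwise awkward characters into a name.
(symbol-name '|hello world|)   ; => "hello world"
```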
Then, early in the Packages chapter:
> In this chapter, you will see that several different symbols can have the same symbol name—as long as they are in different packages.
Thanks to starting with this book, I had an accurate mental model of packages from the get go.
The "name spaces for symbols" phrase appears only in the Preface. Firstly, nobody should be taking a passing remark in the Preface as a lecture on how packages work. Secondly, the phrase has a straightforward interpretation which is squarely correct. A symbol has a name which is a string. That name exists in a namespace, which is a package. If we understand "name" to mean "symbol name" (and not "symbol used as a name") then a package is a namespace, and it is of symbols.
In that sense, a lexical scope is a namespace for variables and function bindings, and a class object contains a namespace for slots. Those namespaces are not for symbols; they are based on symbols being the names, i.e. of symbols.
If you don't know anything about Lisp symbols and packages, and are reading only the Preface, you will probably not understand what exactly "name spaces for symbols" means; you will just have to read the book, and carefully go through the Symbols and Packages chapters.
No, they aren't. The strings "Foo" and "FOO" and "|FOO|" might be read by the reader as the same symbol, but then again, they might not. It depends on a great many things.
By default, all else being equal, yes, reading these three strings will produce the same symbol. But there are any number of factors that can change this.
Clozure Common Lisp Version 1.12.1 (v1.12.1-10-gca107b94) DarwinX8664
? (setf x 'foo)
FOO
[Stuff elided]
? (eq x 'foo)
NIL
[More stuff elided]
? (EQ 'foo 'FOO)
NIL
Figuring out what I left out is left as an (elementary) exercise (though the fact that I had to type EQ instead of eq in the last line is a big clue).
And I know that you know this. My point is not that you don't understand symbols or name spaces; I know you do. My point is that explaining this stuff is hard, and very few people seem to be willing to put in the effort. And this is not unique to CL, and it's not even unique to software, or even to STEM. It seems to be endemic in the human condition. But that doesn't mean one should not lament it or try to improve it.
It's probably not necessary to mention that the treatment of case and whatnot is default behavior. That can be covered in an advanced chapter about read tables. There we can say, oh we lied when we said that foo, FOO and |FOO| read the same; this is actually highly programmable.
The student will not encounter that unless they explore someone else's code, or discover the features like readtable-case and experiment.
Just talking about the default behavior, if it is stable and portable, and doesn't mysteriously flip behind your back, is fine.
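For reference, a sketch of what experimenting with readtable-case looks like. The helper same-symbol-p is made up for the example, and it works on a private copy of the standard readtable so nothing flips behind anyone's back.

```lisp
(defun same-symbol-p (case-mode)
  "Read \"foo\" and \"FOO\" under CASE-MODE and compare the results."
  (let ((*readtable* (copy-readtable nil)))   ; fresh copy of the standard readtable
    (setf (readtable-case *readtable*) case-mode)
    (eq (read-from-string "foo") (read-from-string "FOO"))))

(same-symbol-p :upcase)     ; => T, the default behavior
(same-symbol-p :preserve)   ; => NIL, two symbols named "foo" and "FOO"
```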
Even if you leave case out of it, it is not true that Foo and FOO are necessarily the same symbol, or even that FOO and FOO are necessarily the same symbol. In fact, the whole point of packages is that you can have two different symbols with the same name.
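A sketch of the same three characters reading as two different symbols, with case left entirely out of it. The package names SCRATCH-A and SCRATCH-B and the function read-foo-in are made up for the example.

```lisp
(defpackage "SCRATCH-A" (:use))
(defpackage "SCRATCH-B" (:use))

(defun read-foo-in (package-name)
  "Read the text FOO with *PACKAGE* bound to the named package."
  (let ((*package* (find-package package-name)))
    (read-from-string "FOO")))

(let ((a (read-foo-in "SCRATCH-A"))
      (b (read-foo-in "SCRATCH-B")))
  (list (string= (symbol-name a) (symbol-name b))
        (eq a b)))            ; => (T NIL)
```

Same text, same name, different symbols; the only thing that changed between the two reads is the value of *package*.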
---
[1] http://www.lispworks.com/documentation/HyperSpec/Body/26_glo...