Go 1.21 may have a clear(x) builtin (utcc.utoronto.ca)
104 points by aviramha on Nov 19, 2022 | 126 comments


Java has a specific hack for this which I discovered by accident a few years ago. Normally, given (primitive) double values d0 and d1, boxing them does not affect equality testing - i.e. d0 == d1 if and only if Double.valueOf(d0).equals(Double.valueOf(d1)). However, if d0 and d1 are both NaN, then the boxed versions ARE considered equal while d0 == d1 is false.

This inconsistency infuriated me when I discovered it but the Javadoc for Double.equals explicitly states that this anomaly is there to "allow hash tables to work properly".


Using floats as hash keys is insane, no?


Not at all.

If I get floats from $somewhere, I might want to index them into a hash data structure without creating a huge sparse array.

If you assume float numbers have "identity" such that, whatever the precision, "x == x" always holds (other than NaN), then this is a perfectly valid thing to want.


I understand your point, but really floats are not reals, and we should test for equality as float plus precision (i.e. within a tolerance).


Yes, why would you ever want a map with floating point keys?

I'm struggling to think of a valid use case where there is no better alternative.

Any design utilizing this language "feature" seems masochistic and begging to get sliced by sheet metal edges.


Say I'm in a dynamic language and have a dictionary in which an object of any type whatsoever can be a key. Why would I arbitrarily disallow floating-point objects, if a character, string, lexical closure, database connection, file handle, ..., or lists or arrays of these can all be hash keys?

A hash table of floating-point values can be used for, say, memoizing a function whose argument (or arguments) are floating-point.

A compiler could use a floating-point-keyed hash table for deduplicating identical floating-point constants. Say that constants are stored in some static area, and referenced by address: it's wasteful to repeat those constants. Some constant defining mechanisms (like #define in C) proliferate copies of a constant as a repeated subexpression.


Because all those other things are either identity types (handles, closures, etc) or are value types where the only possible way to represent them is with a single "canonical form."

But IEEE754 allows not only NaNs, but also "negative zero" and denormals. Floating-point numbers, in other words, allow for multiple different bit-encodings that represent mathematically equal, but not identical, number values. There are non-canonical forms of the "same" numbers. And programming-language runtimes don't do anything to prevent CPUs from returning these non-canonical numbers; nor do they massage them back into canonical forms upon receiving them. They just end up blindly holding these non-canonical numbers — numbers which, if they ask the CPU if they're equal, they are; but if they look at the bit-patterns and compare those for equality, they're not.

> A compiler could use a floating-point-keyed hash table for deduplicating identical floating-point constants.

Bad example. A compiler wouldn't want a hash table whose key type is a floating-point number, because that would imply that the hash table is operating using the IEEE754 definition of equality vis-à-vis key presence/collision checking.

Rather, a compiler would use a bitstring key type, where the keys are the bit patterns representing either the target ISA's native FP-register encodings of the given floating-point numbers; or the abstract/formal bit-packed form of those floating-point numbers according to IEEE754.

The difference between the two is that the latter uses bitstring collation (ordering and equality), not floating-point collation.
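
A minimal Go sketch of that approach, keying on the IEEE754 bit pattern via math.Float64bits (the pool/intern names are just illustrative):

    package main

    import (
        "fmt"
        "math"
    )

    func main() {
        // Deduplicate constants by raw bit pattern rather than by == on floats.
        pool := map[uint64]float64{}
        intern := func(f float64) uint64 {
            bits := math.Float64bits(f) // the IEEE754 encoding, as a uint64
            pool[bits] = f
            return bits
        }
        fmt.Println(intern(math.NaN()) == intern(math.NaN()))     // true: NaNs dedupe by bits
        fmt.Println(intern(0.0) == intern(math.Copysign(0, -1))) // false: +0 and -0 stay distinct
    }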


> value types where the only possible way to represent them is with a single "canonical form." But IEEE754 allows not only NaNs, but also "negative zero" and denormals

Unicode strings also have normalization forms. Not to make an argument, just a reminder.


True — but Unicode strings that are not bit-equal aren't collation-equal either, so that doesn't usually matter. A non-canonicalized Unicode string won't "find" its canonical counterpart in a hashtable.

IEEE754 floats (and doubles) are kind of unique among scalar types, in that the answer to "are these equal" in pretty much every programming language is delegated directly to the CPU to answer; and, in obeying IEEE754 semantics, the CPU's definition of equality for FP numbers makes things equal that aren't bit-representation-equal.


Having a generic container that handles all types of values except one, purely for policy reasons, is rarely a good idea. If floating point instabilities should be a concern (yes), then not at the Map level. That would be a naive crutch.

Why would you ever want a number to be stored in a floating point format - that is a much better question.

Most of our numeric values don’t even correspond to the domain of floating point. For business and programming goals, -0 is complete technical nonsense. NaN too; it should just raise an exception (there are quiet and signalling NaNs; everyone defaults to quiet). And so should non-low-level integer overflow, really, unless there is a software fallback. Intermediate calculations limited to a low fixed number of digits are nonsense. Accounting for constant rounding/formatting errors is also a nuisance.

Builtin, first class citizen, hardware enabled, base 10 fixed point could be the answer. But there’s almost always either a bigint-only type that is barely used even in the standard library, or a serious performance hit that nobody wants. There would still be issues, but much less in practice.

Floating point is a niche type suitable for “multimedia”, NNs and maybe a few other special contexts. They are the default only because everyone on the hardware-to-runtime spectrum traditionally believes that ordinary numbers are not their responsibility.


Memoizing computations for which you know the input(s) come from a small range of numbers?

IEEE floating points are 32 bit / 64 bit, so it's not in principle any more insane than using (u)int32 / (u)int64 as keys. It's not like the key space is unbounded or even hard to estimate - it's just not a common thing to see in the wild, and I guess most devs rarely have a reason to consider how many values fit between two floating point numbers.


> Yes, why would you ever want a map with floating point keys?

One possible answer: you're in Javascript or something similar. You wanted integer keys, but all you have is floating-point numbers.


Or.. convert the integers to string form and use that. Wacky nonsense situation averted.


I agree. There are several instances of "valid" keys nobody should use.


100% with you bud.

Another one which probably shouldn't exist is a map with boolean keys.

Edit: It might be nice if a language compiler detects such bastardizations and spits out a proposal for a better alternative structure or approach, perhaps even stubbornly refusing to proceed to compile such a shitty idea.

Golang already kind of does a spiritual form of this by disallowing unused variables.

This would undoubtedly be similarly controversial. Ego ruins all.


> Yes, why would you ever want a map with floating point keys?

Because it's a cache and you didn't specifically handle that edge case.


I don't know what precisely the poster meant. However, from my perspective, I see it as a simpler design to explicitly hash float values into keys.


That... still requires the foresight to see that edge case, and being in a situation where you're not colliding your float hashes with random integers?


Externally hashing outside of the hash table requires you to take on ugly complications. What if two distinct floating point keys map to the same key? The values of your hash table then have to be lists or arrays so you can keep all colliding entries. You've then wastefully built chained hashing around a hash table!


The same approach I suggested elsewhere in this thread will work for this case, too. Convert floats to their string representations and key off that. Equality ambiguity problem solved.
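
A rough Go sketch of that idea (strconv.FormatFloat with precision -1 yields a string that round-trips exactly):

    package main

    import (
        "fmt"
        "math"
        "strconv"
    )

    // key renders the float exactly; "NaN" == "NaN" as a string,
    // so even NaN keys become usable.
    func key(f float64) string {
        return strconv.FormatFloat(f, 'g', -1, 64)
    }

    func main() {
        m := map[string]int{}
        m[key(math.NaN())] = 1
        fmt.Println(m[key(math.NaN())])     // 1: retrievable, unlike a raw float64 key
        fmt.Println(key(0.3), key(0.1+0.2)) // 0.3 0.30000000000000004 -- rounding still bites
    }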


Not if your hash table is designed to accommodate key conflicts, and the float is deterministic.


It's done all the time in JavaScript. Why not in Java?


Javascript does a ton of things that are extremely questionable, so it's not a good source to say you should do something. Rather, it's a better argument (though still flawed) to say that because it's done all the time in Javascript, it's evidence that you shouldn't do it.


I'm surprised this is done in JS. If you're doing any kind of calculation with floats (and use actual non-integer numbers, not just ints stored as floats) then it's generally difficult to get the same exact result every time on every system, due to rounding errors. That's why to my knowledge, any strict equality check on floats is generally frowned upon. (exceptions: you know all the floats are actually non-huge ints; You're comparing with a value the variable was explicitly set to at some time before)


JavaScript conceptually only has floats (ignoring big integers, which isn't what most people use for numbers), so any map of numbers to values is using a float key. You refer to that, but yeah they're all floats.

> it's generally difficult to get the same exact result every time on every system, due to rounding errors

Floating point arithmetic is 100% entirely deterministic - you don't just get different values randomly.


This is so wrong

ECMAScript does not have maps where keys can be integers and neither can keys be floats.

Everything is a pointer to a value, and otherwise reflected with a symbol or string (which in return is a unique symbol). An object doesn't have object[1] because it is actually object[reference("1")]

This way boxing and unboxing of arrays is much cheaper, due to the arrays not needing a resizing of their cells once datatypes change.

(TypedArrays are optimized in a different manner, but we are speaking about the NaN use case, which implies either an Array or Object)

There are dozens of talks about e.g. V8's hidden classes on YouTube that demonstrate this very mechanism.

No idea where you got that float keys idea from.


> ECMAScript does not have maps where keys can be integers and neither can keys be floats.

  map = new Map();
  map[1] = 3;
  map[1]; // 3
  map[1.2] = 3.4;
  map[1.2]; // 3.4
  typeof(map.keys().next().value); // "number"
I'm not an expert in JavaScript. What am I missing?


A nitpick: You're setting object properties on a Map object, which only works "accidentally" but does not make use of the map at all[1]. So in this case, it likely works because the numbers are converted to strings before being used as keys.

But in general, using float keys does work - but can fail in unexpected ways due to rounding errors. E.g.:

  m = new Map();
  m.set(0.3, "foo");  // --> Map { 0.3 → "foo" }
  m.get(0.3); // --> "foo"
  m.get(0.1 + 0.2);  // --> undefined 
That's because 0.1 + 0.2 == 0.30000000000000004 in IEEE floats, which is != 0.3. [2]

So using float keys which are derived from calculations and may contain non-integer numbers is a bad idea, because unless you have very good knowledge about floating point math, you can not easily predict what exact value the result of a calculation will be.

[1] https://developer.mozilla.org/en-US/docs/Web/JavaScript/Refe...

[2] https://stackoverflow.com/questions/8503157/ieee-754-floatin...


Right - should be using set and get - was written out without running it.

But isn't this still a map that has float and integer keys? Why doesn't it count?


From what I got, it's a very stupid interaction of different JavaScript language features.

Every object in JS has basic map functionality. Since version 1 of the language, you could always write things like:

  var o = new Object();
  o["foo"] = "bar";
  o["foo"] // --> "bar"
However, this functionality is relatively limited: It only supports strings (and symbols) as keys and it interacts badly with other methods and properties which are part of the object. Hence why an explicit "Map" type was added to the language much later.

The problem is, Maps are also objects, so they still inherit the "old" map functionality in addition to the new functionality. Those two systems are completely independent, even though they act on the same object. So writing

  mymap["foo"] = 1
and

  mymap.set("foo", 1)
both store the mapping "foo" -> 1, but in completely different places. Only the second one will actually put it into the store of the map, while the first one will just add it as a generic object property.

You can see it when trying to retrieve the value again:

  mymap["foo"] = 1
  mymap["foo"]; // --> 1
  mymap.get("foo"); // --> undefined
likewise:

  mymap.set("bar", 1);
  mymap.get("bar"); // --> 1
  mymap["bar"]; // --> undefined


Yes the square brackets were just a typo. But maps do support float keys - you can try it yourself!

  map = new Map();
  map.set(1.2, 3.4);
  console.log(typeof(map.keys().next().value));
That says the key is a number, and it's a floating point number. So what did they mean by "ECMAScript does not have maps where keys can be integers and neither can keys be floats"?


Ah, I misunderstood that, sorry.

I think the GP was wrong there. JS maps absolutely do support float keys, it's just generally a bad idea to use them if you don't restrict yourself to integers.


Object property names are always strings, even though you can set them via number literals (as the person you replied to said):

    const object = {}
    object[1] = 3

    console.log(object["1"]) // 3
    for (const key in object) {
      console.log(typeof key) // "string"
    }
This is different from keys of the Map data structure, which are actually able to be any type of value (even silly stuff like other Maps).


I don't know - I think that proves what I said - map keys can be floats.


For sure, I never disagreed with that (I even said "JavaScript maps absolutely can use numbers (or any type) as keys" in my other comment).

But object property names cannot be floats (which is what your example was, despite them being properties of a Map object).


What runtime did you use for that?

JavaScript maps absolutely can use numbers (or any type) as keys, but that's not how its API works. Square brackets access object properties, but map entries are not stored like that, and that last `typeof` is `"undefined"` in every runtime I tried (not `"number"`). Try `map['set']` to see another example.

Here's what I think you meant:

    const map = new Map();
    map.set(1, 3);
    map.get(1); // 3
    map.set(1.2, 3.4);
    map.get(1.2); // 3.4
    typeof map.keys().next().value; // "number"


Square brackets were a typo - I didn't run it, I just wrote it off the top of my head. Doesn't your correction prove what I said though?


We're certainly in agreement that cookiengineer's "ECMAScript does not have maps where keys can be integers and neither can keys be floats" assertion is dead wrong.


Where does Map.__proto__ point to? There's your answer :)

Everything in JS is an Object, therefore everything can have hashed keys. Including Arrays, because Array.__proto__ also points to Object.


You’re being pedantic. Nobody is talking about internal representations, they’re saying it’s literally possible to key into a map with a float, which is true.


No it's not, because they are talking about keying via Number.prototype.toString(), which is not keying via float but via its string representation.

And that's my point, it's even part of the ECMAScript spec. See 6.1.6.1 and 6.1.7.1 [1]

Additionally, a proof you can quickly tryout:

  object = {};
  object[-0] = 123;
  object[+0]; // returns object["0"];

If the object key were a Number type, it would not use Number.prototype.toString().

You can literally create your own datatype and override valueOf() and toString() to play with this.

[1] https://262.ecma-international.org/13.0/


What do you think is happening when you use set with a float on a map?

How do you explain this?

    const map = new Map();
    map.set(1, 3);
    map.get(1); // 3
    map.set(1.2, 3.4);
    map.get(1.2); // 3.4
    typeof map.keys().next().value; // "number"
Where do you think ‘number’ comes from?

Read the spec - it supports primitives for keys and it doesn't stringify them. As others have said in this thread - you're just wrong.


It's done in JavaScript because there is no integer type. (Ok, ok, nowadays there is bigint, but historically.) So you do all int-like calculations with number (which is IEEE double), but make sure to only do arithmetic where the result is still an integer. That's why you end up using a double as a numeric key. And I think a double can basically hold a 53-bit integer exactly.
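
A quick Go illustration of that 53-bit boundary (the same IEEE doubles JS uses for its numbers):

    package main

    import "fmt"

    func main() {
        var n int64 = 1 << 53 // 9007199254740992
        fmt.Println(float64(n) == float64(n+1)) // true: 2^53+1 rounds back to 2^53
        fmt.Println(float64(n-1) == float64(n)) // false: integers below 2^53 stay exact
    }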


This is definitely not my area of expertise, but my understanding was that the mantissa was actually the source of inexactness in floats? I know the exponent bits (11 of them in a double) are pretty close to actual two's complement ints, and you could always do things in a C-like union, but I wasn't aware you could directly treat the mantissa like an int


Ok, that's different then. Using integers as keys should not be a problem, even if they are stored as floats. Problems start if you use non-integers or IEEE special values as keys.


No. JavaScript does that all the time.


(Because JS has no integers)


That used to be the case but JavaScript has BigIntegers now.


JavaScript actually converts all non-Symbol keys to strings.


Yet another reason not to do it. Still, if your floats are always integers under a certain size, it works fine to use floats as hash table keys.


Note that the contract for the equals() method requires that the method always returns true when called on itself. So the designers had to decide whether to break the contract on NaN or the contract on equals(). I could imagine that honoring equals() was the solution with the least expected breakage with downstream users.


The issue with NaN equality is interesting, but is that really why they're adding a clear(x) builtin? What if you want to remove a single-NaN key from a map? clear(x) seems like a band-aid at best if the Go team is strictly trying to fix removing NaN keys in maps.
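
To make the stuck-key behavior concrete, a small Go sketch (the output comments assume current, pre-clear map semantics):

    package main

    import (
        "fmt"
        "math"
    )

    func main() {
        m := map[float64]bool{}
        m[math.NaN()] = true
        m[math.NaN()] = true // each insert adds a fresh entry, since NaN != NaN
        fmt.Println(len(m))  // 2

        for k := range m {
            delete(m, k) // delete looks the key up with ==, so NaN entries survive
        }
        fmt.Println(len(m)) // still 2; the proposed clear(m) would drop them
    }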


The way Rust deals with this issue is by saying that the key must implement both `Hash` and `Eq`, where `Eq` is a strict equality compared to `PartialEq`, which allows the equality relation to be incomplete. While integers implement both PartialEq and Eq, floating point values only implement PartialEq.

https://doc.rust-lang.org/std/collections/struct.HashMap.htm...

https://doc.rust-lang.org/std/cmp/trait.Eq.html

https://doc.rust-lang.org/std/cmp/trait.PartialEq.html

There is also a partial ordering relation `PartialOrd` (and `Ord` for total orderings). Floating point values only implement the partial relation as well, because of NaNs. This means that it is a bit harder to sort them, though the floating point standard also has a separate total comparison relation with some extra rules which you can use instead.

https://doc.rust-lang.org/std/primitive.f64.html#method.tota...


Rust's HashMap and similar containers have clear() and many of them have drain()

Drain is interesting, because it resolves the other problem: things being in the collection which can never be retrieved from it. Drain gives you each of the things in the collection, one at a time, to do with as you wish, and the collection is then empty. So even if the collection has a dozen SillyNonsenses in it, all of which insist they're not the one you were looking for whenever you go looking for a SillyNonsense in the collection, you get all twelve of them out with Drain, and can examine them as you wish.

Of course if you just wanted to fish one thing out and then throw the rest away, you can do that, once you drop the result of Drain the rest of the things being drained are dropped.


I thought HashMap behaved in an undefined-but-safe manner if an implementation of Hash or Eq is incorrect. Are you saying there's a reasonable way to put misbehaving types in a HashMap?

(I'm pretty sure I recognize your username as someone with way more rust experience than me. For the crowd an implementation of Eq for floats that used the standard comparison operators would be "incorrect", which is why the stdlib doesn't provide one)


If your key type K doesn't obey the rules explained in HashMap ("if two keys are equal, their hashes must be equal") then Rust doesn't promise HashMap<K,V>'s functions will do something useful (although it will still be safe if your type is safe). So no this isn't a license to put silly things in hash maps. Non-useful but safe consequences might include spinning indefinitely.

But if you have put something daft in a hash map, drain() should definitely get it back out, so there's that.


This is one of those rough edges where Rust is more annoying than helpful imo. It's also inconsistent.

I'd rather panic on NaN comparisons than complicate the fundamental comparison traits. Treat it like integer overflow or array bounds access. We don't have different traits for indexing that might fail or overflow that might happen. There are methods on the trait to handle those cases when you care about them.

But that said, this feels like an XY issue. If you're comparing hashes of floats as keys somewhere you don't want IEEE 754 defined equivalence. You almost certainly want bit equivalence, which means the keys should be type punned to u32/u64. Classic example is memoizing function calls with floats. You don't want to compare the semantic values of the arguments, but the literal bits. For floats that's not the same.


> I'd rather panic on NaN comparisons than complicate the fundamental comparison traits.

The fundamental comparisons are complicated period. Pretending they’re not and hoping you don’t hit the branches that crash is a terrible idea for writing reliable software.

For everything where the comparisons aren’t complicated (like integers), Rust is easy. For things where the comparisons are complicated, Rust makes you acknowledge this reality and deal with it. This is correct behavior.


No you don’t, you want to use the hardware defined equality function. And that one is IEEE 754 compliant.


In a hash table, I want my equality function to be based on the actual value, not the super special comparison for a specific type.

If I write a UniqueNumber that just returns false on the == operator, I'd be stupid but the hash map should still work. Java has separate hashCode() and equals() methods for a reason.

Either provide a hash key override or use the raw bits for deep compares. Custom equality algorithms usually don't make sense for arbitrary key-value stores.


Because of the Pigeonhole Principle what you're describing is nonsense.

In Rust you can make UniqueNumber, indeed misfortunate::OnewayGreater is such a type. misfortunate::Maxwell even more so.

However of course the HashMap can't meaningfully "work" for this type. Rust promises your program doesn't have Undefined Behaviour despite such a prank, but it might spin forever when you try to insert this type into a map for example, a well defined but undesirable behaviour brought on by your poor choices.

You seem to imagine that Java's hash map doesn't need to compare types, that it can just use hashCode() -- however because of the Pigeonhole Principle that can't actually work.

And NaN != NaN isn't a "custom equality algorithm" it's literally how the floating point numbers are defined in your CPU.


Floating point numbers are nothing more than bits with context. They're not some kind of super special construct that only dedicated hardware can operate on. They have their own modification instructions so they incorporate the float standard, but that standard is only relevant when you're operating on numbers.

In my opinion, language native hash maps should operate on memory, not on types and their weird implementations. Negative zero and positive zero are defined as different values but are mathematically equal; however, math functions may (and in the case of C, do) behave differently depending on which one you use. Even in higher level languages such as Java sorting gets affected by the presence of negative and positive zero using min/max operators.

If two values claim equality but do effectively alter program behaviour, I consider their equality operation in the context of memory operations such as hash maps to be buggy. There are good reasons to rely on the equality operators, but I do not think such behaviour should be the default.


The Java (OpenJDK) HashMap putVal doesn't seem to compare types, it just uses .equals()

  if (p.hash == hash &&
      ((k = p.key) == key || (key != null && key.equals(k))))
You can't really rely on checking types in a Hashmap anyways because the type you're hashing can have more possible values than the 32bit hash. For example you can have 2 strings that have a hash collision.

NaNs work in a Java HashMap because Float.hashCode returns floatToIntBits(float), which normalizes all NaNs to a canonical value.


Sorry, I should have said compare values of-this-type. I was focused on the fact that type does matter here, which as you show the equals method is called on key, not some arbitrary system-wide equals method but a method defined on key's type.

Rust has a magic unstable trait StructuralEq which means not only are we promised that values of this type can be compared for equality (that's what Eq does) but that comparison will just be a bitwise memory comparison. Lots of useful things are not StructuralEq even if they are Eq

Apparently Java's Floats also do the same canonicalisation for equals(). So in effect in Java although NaN != NaN, once you box it that ceases to be true. This seems like a spectacularly bad idea to me, but presumably it made somebody's awful hack work at one point.


> This seems like a spectacularly bad idea to me, but presumably it made somebody's awful hack work at one point.

but elsewhere you say, "You are welcome to build a type which has this property and declares that it is Eq". This is exactly the difference between double and Double in Java. double has the `==` operator that isn't reflexive and works as required by IEEE-754 and Double has Double.equals, which is documented[1] to be a reflexive, transitive, symmetric, consistent, and substitutable relation.

[1]: https://docs.oracle.com/en/java/javase/17/docs/api/java.base...


Is it clearer if I explain that you are welcome to do things which are a spectacularly bad idea ?


How are you so sure what other people want? Using the hardware defined equality function for integers is reasonable a lot of the time.


My point is sometimes I want NaN to compare with NaN as true when I'm using a float as a key in a table.


You are welcome to build a type which has this property and declares that it is Eq.

But Rust's built-in floating point types f32 and f64 do not have that property.


> This for loop is less efficient than clearing a map in one operation

For maps with keys that are reflexive with == the Go compiler already optimizes the range loop to a single efficient runtime map clear call: https://go-review.googlesource.com/c/go/+/110055


Yes, for most cases, the for-loop is as efficient as the proposed "clear" function. I am not sure whether or not this is still true if there are NaN keys in the cleared map.


Interestingly, JavaScript handles this correctly: Map keys are compared with SameValueZero equality (like Object.is, except that +0 and -0 are treated as equal), which differs from `===` around NaNs, so `const m = new Map(); m.set(NaN, 3); m.get(NaN)` returns 3 and `.delete` similarly works.


I'm curious why this is not implemented as a method on the map type (and others like list) instead of being a top-level builtin. I suppose it is consistent with other collection operations such as append and len... which I guess makes me wonder why those are builtins as well.


The answer is lack of generics in all previous versions of Go. There's no way to express a generic interface to have methods for map or slice (which are not interfaces, but compiler magic instead). All the operations on builtin types must be implemented as globals handled by the compiler.


Been using Go since the early days, and 'make/append' always felt clunky and out of place. Especially when you could just use literals for a map/slice...


> Been using Go since the early days, and 'make/append' always felt clunky and out of place. Especially when you could just use literals for a map/slice...

Most people do use the literals for creating the empty map/slice. "make" is mostly useful when specifying a known length or capacity, which saves unnecessary allocations and improves performance.
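
A small sketch of the difference (the sizes are illustrative):

    package main

    func main() {
        // Literal: fine when the final size is small or unknown.
        a := map[string]int{}

        // Capacity hint: pre-sizes the table so the first ~10000 inserts
        // don't trigger incremental growth.
        b := make(map[string]int, 10000)

        // Same idea for slices: length 0, capacity 10000.
        c := make([]int, 0, 10000)

        _, _, _ = a, b, c
    }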


Russ Cox explains in the issue why they don't want it to be a standard library function:

https://github.com/golang/go/issues/56351#issuecomment-12914...

"We talked about defining maps.Clear in #55002, but how? It would not be possible to write in pure Go code. Instead it would need some kind of hidden entry point into the runtime.

In the past I've referred to that kind of library change as a "back-door language change", meaning it's a language change - it adds something not possible in pure Go - but pretends not to be one. We try hard to avoid doing that."


From the go FAQ:

> Why is len a function and not a method?

> We debated this issue but decided implementing len and friends as functions was fine in practice and didn't complicate questions about the interface (in the Go type sense) of basic types.

I imagine similar reasoning applies to "clear" here.

https://go.dev/doc/faq#methods_on_basics


That's a pretty weird rationale IMO. "It doesn't break anything" should be the minimum requirement for a major design choice, but it certainly isn't sufficient on its own. They sort of imply that making it a method would "complicate questions about the interface of basic types", but I don't really understand what that would even mean "in the Go type sense". Are they worried that someone might make an interface with a `Len` method and therefore be able to pass slices and maps to it? That honestly seems like a feature, not a bug, and certainly not worth throwing out the nicer syntax for.


In Go, it's common to create new types from the built-in ones. Here's an example: https://pkg.go.dev/net/http#Header

When you create a new type like this, you lose all methods from the original type. If the delete operation was implemented as a method on the `map` type, `Header.Del` would need to convert back to a `map` type to call it.

    (map[string][]string)(h).Delete(key)


Interesting, I was aware of that newtype wrapper pattern, but I did not realize that methods weren't automatically forwarded to the wrapped type. This isn't the design I'd personally choose, but it does seem more consistent with what I know of Go's philosophy now that I understand this.


It's hard to articulate but whenever I find myself wishing I could inherit methods in a wrapper type, the design is bad. The designs where it ends up not being necessary are always much friendlier.


Yeah, I don't think I disagree there. If you want to automatically delegate methods, you can always use a type alias, and if you want to have delegation but type strictness (i.e. not having the alias and the original being interchangeable), you can do it manually. I think the two things here that I personally don't prefer are not having some way to opt into delegation (e.g. the `Deref`/`DerefMut` traits in Rust) and the more general paradigm of interfaces being implicit (which if I understand correctly is why it would be ambiguous about whether the wrapper types implemented the same interfaces as the wrapped type), but those are much larger philosophical differences. If you do want interfaces and wrapper types to work the way they do in Go, I agree that having a method for length would probably be less consistent with the design philosophy.

I do feel like the FAQ could be a bit more explicit about the rationale though; most people who read that page and have that question aren't going to be as experienced in Go, so they aren't as likely to read "interfaces on basic types" and go "aha, yes, that would be weird with wrapper types!". That said, I also don't really use Go at all due to the philosophical differences I mentioned above, so maybe my reading isn't going to be representative of their target audience anyhow.


The most obvious flaw in Go is the top-level bs. Why on earth are they adding things to that? It is of course down to other poor decisions made historically that force this design. But anyway. They should fix the lang and scrap the top-level nonsense altogether.

For a beginner in Go it makes absolutely no sense at all. Most of the builtins should be methods.


Why would they be implemented as methods when they can be functions?


In Java, the primitive type double uses the IEEE-754 semantics and java.lang.Double uses the bit-by-bit comparison semantics [1], so List<Double>.remove() works correctly.

[1] https://docs.oracle.com/en/java/javase/19/docs/api/java.base...


This won't help you in a future version of Java where you can write List<double>, right?


No.

What will help you is that `clear()` is part of `java.util.Collection`, so just about every container has it.


It depends if List<double> is an alias for List<@PleaseSpecialize @NotNull Double> or not.


Usage of NaN as hash key reminds me of the two uses of NULL in SQL. One is as unknown values which are never equal to anything even other NULL values. Another use is as a key for aggregate grouping. In that case the entry represents the aggregate for all the unknown values which aren't equal but still grouped together. Different uses have different meanings so invalid in one use does not invalidate other uses.


You've already got zero and negative zero as an outlying case. I'm not sure why anyone would feel comfortable using floats as keys in a map. To anyone who has done this.. why? What was the use case?


Go has no built-in set data structure; you use a map if you don’t want to bring in a third-party package. Maybe this is why?


A set of floats sounds like an equally questionable idea as a map of float keys, for the exact same reason.


For a game I'm making, I have a helper function that takes in a specified time (as a float) and returns a (heap based) wait instruction. The times are literals sprinkled throughout the code base of callers of this function, so I maintain a cache of time -> wait instruction to reduce allocations.


Wouldn't your life be much simpler if you'd represented time as integer milliseconds? I totally get it being too late now, but I think that's an argument for making floats a little bit unpleasant to start working with.


C# had the same NaN behavior until recently: https://learn.microsoft.com/en-us/dotnet/core/compatibility/...


Oof, I don't like that Matrix3x2 type. "It's a 3x2 matrix, but we're going to pretend it has a last column going 0/0/1 so we can get a determinant or multiply them together, but when we negate or add or subtract them it's still going to be 0/0/1."


It's the classical representation of affine transforms (translate, rotate, scale, etc) using homogeneous coordinates.

https://en.wikipedia.org/wiki/Transformation_matrix#Affine_t...


Ah, thanks. I figured it might be something like this but the documentation is very dry and doesn't seem to mention it.


Meh using floating point as keys for maps in any language is just asking for trouble -- it’s not just NaN

I don’t think there’s any real use case for it

I’d say clear() is good for clarity, and that’s it


> I don’t think there’s any real use case for it

Histogram where you want to keep a count for each float value you’ve seen (maybe rounded to some precision to reduce the number of buckets)

Not disagreeing with your comment, just saying it’s not uncommon


Why not just multiply each key by ten to the power of your desired precision, then convert to int?

If you're rounding anyway it's basically the same amount of work.


You can certainly map floats to int ‘bin number’ which is the index into an array. No need for a map.

But the ‘naive’ histogram is an example of a map using float keys.
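
A hedged Go sketch of both shapes (the bucket width is arbitrary):

    package main

    import (
        "fmt"
        "math"
    )

    func main() {
        samples := []float64{0.12, 0.14, 0.31, 0.33}

        // "Naive" histogram: float keys, rounded to one decimal place.
        naive := map[float64]int{}
        for _, x := range samples {
            naive[math.Round(x*10)/10]++
        }

        // Binned histogram: integer bucket numbers as keys instead.
        binned := map[int]int{}
        for _, x := range samples {
            binned[int(math.Floor(x*10))]++
        }

        fmt.Println(naive)  // map[0.1:2 0.3:2]
        fmt.Println(binned) // map[1:2 3:2]
    }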


Hm I don't agree that example means it's "not uncommon" -- why would you want to hash the float rather than the bucket number?

Hashing the floats themselves isn't going to give you a useful histogram.

Is there a codebase that hashes float to a useful effect? If it exists, I'd be interested to see it


This looks like catching up to GNU Awk, which allows "delete a" to clear an associative array. That's an extension over POSIX, which specifies "delete a[key]".


> This for loop is less efficient than clearing a map in one operation

That is not obvious; a compiler could have a pattern match for that exact AST pattern, and transform it to a delete operation. (Except for that pesky issue where the two fail to be equivalent due to NaN keys.)

Quick and dirty, not entirely correct proof-of-concept in TXR Lisp:

  1> (macroexpand '(my-dohash (k v some.obj.hash) (print [some.obj.hash k])))
  (dohash (k v some.obj.hash
           ())
    (print [some.obj.hash
             k]))
  2> (macroexpand '(my-dohash (k v some.obj.hash) (del [some.obj.hash k])))
  (clearhash hash)
Impl:

  (defmacro my-dohash ((kvar vvar hash : result) . body)
    (if-match @(require ((del [@hash @kvar]))
                        (null result))
              body
      ^(clearhash hash)
      ^(dohash (,kvar ,vvar ,hash ,result) ,*body)))



OK, now I understand why OCaml's Float.compare function compares NaNs as equal to each other: https://discuss.ocaml.org/t/assertions-involving-nan/10762/4


I'm more surprised they went with a new built-in instead of making the compiler recognize something like

  m = make(map[A]B, cap(m))
just like it already recognized

    for k := range m {
        delete(m, k)
    }
and many other similar idioms.

(cap because len is not quite the same -- note, cap is not currently defined on maps)

EDIT: Likely because that's an assignment to m, not a mutation of it, so it can't be done e.g. in a function that gets m as an argument.


That was weird. Makes sense once you know about NaN's incomparability, but still surprising.

I had to try it in Python (3.10.4, Windows on x86), it worked fine:

    >>> import math
    >>> a={math.nan: 1}
    >>> a
    {nan: 1}
    >>> del(a[math.nan])
    >>> a
    {}


This works as expected only because id(math.nan) is always the same, if you build the nan dynamically, it will fail:

    >>> d = {float('nan'): 1}
    >>> d
    {nan: 1}
    >>> del d[float('nan')]
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    KeyError: nan


It's because cpython shortcuts through identity: https://github.com/python/cpython/blob/main/Objects/object.c...

    /* Quick result when objects are the same.
       Guarantees that identity implies equality. */
    if (v == w) {
        if (op == Py_EQ)
            return 1;
        else if (op == Py_NE)
            return 0;
    }
`math.nan` is always itself, and thus this shortcut is taken and the key works.


Would anything actually break if NaN == NaN equalled true?


Yes, that would mean that sqrt(-1) equals 0/0, for example, which might open a new portal to hell.


Yes, in surprising places. But the most obvious thing is that comparing float values would become significantly more expensive.


Could you explain how/why?


Virtually all architectures have floats implemented in hardware - including NaN behavior. This means things like IEEE 754 float comparison is done in a single instruction, executed in a single cycle.

Explicitly checking for NaNs would (at least) double the cycle count in cases where that branch is not taken, and having to deal with the custom comparison is even worse. This is not helped by the fact that there isn't a single NaN value: there are both positive and negative NaNs, and there are 23 bits which are usually ignored but for which some values do have specific meaning.

To make it even worse, modern CPUs include vector extensions, allowing you to operate on multiple values at once. With AVX-512, you can compare 32 floats per clock cycle. I would not be surprised if switching to a custom comparison made some edge cases over 100x slower.


Comparing the raw bits using instructions intended for integers is quite fast.


NaN is not a single value, it's a multitude of them: a NaN is defined by all exponent bits being set to 1 plus a nonzero fraction (a zero fraction encodes an infinity), so in a double there are 2^53 - 2 different possible NaN bit patterns (though only 2^52 - 1 values, because the sign bit is not part of the "NaN value", and possibly half that if you interpret signaling as a variable property of a fixed value)
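
A quick Go check that distinct payloads are all still NaN (the bit patterns are arbitrary picks from the NaN range):

    package main

    import (
        "fmt"
        "math"
    )

    func main() {
        a := math.Float64frombits(0x7FF8000000000001) // quiet NaN, payload 1
        b := math.Float64frombits(0x7FF8000000000002) // quiet NaN, payload 2
        fmt.Println(math.IsNaN(a), math.IsNaN(b))                // true true
        fmt.Println(math.Float64bits(a) == math.Float64bits(b)) // false: different bits
        fmt.Println(a == b, a == a)                              // false false
    }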


Yes. Many applications do not require that at most one NaN and at most one zero be allowed in each collection, but they do have trouble if some items are not equal to themselves.


I can't see it. I've no idea what problem was solved by the current weird IEEE behavior.

It must have been a biggy because this solution means your domain now loses the property that, in general, x == x. This property is a fundamental axiomatic feature of any notion of equality. A non-reflexive equality isn't just weird - it's daft/stupid/surprising.


> I've no idea what problem was solved by the current weird IEEE behavior.

The same problem that's solved by having a lot of proofs say "for all y != 0, ..." If NaN == NaN, then you can no longer assume that x/z == y/z implies x == y. Which is also a fundamental (though non-axiomatic) result of arithmetic!

NaN is an effort to represent partial functions in code that usually expects totality. It's not the best way but it's pretty good, and making NaN == NaN is also not how we'd do it with decades of hindsight.


Isn't "if(x != x)" the canonical way to test whether x is NaN? That seems like quite a big thing to break.


Most languages have some kind of isNaN(value) function in my experience. Even C and C++ have an isnan macro (standard since C99, but platforms had the macro before: https://repo.or.cz/w/glibc.git/blob/HEAD:/sysdeps/ieee754/db...)

People do probably rely on NaN ≠ NaN but I think that's more of a side effect of compatibility than anything else.


Go's implementation of math.IsNaN is

    return f != f


Hold on a sec, the whole clearing-map scenario: is it for reuse within your fn, or is it trying to free memory? If the latter, I’ve just been hoping GC does its job



