Java has a specific hack for this which I discovered by accident a few years ago...

bobbylarrybobby · on Nov 19, 2022

Using floats as hash keys is insane, no?

zbentley · on Nov 20, 2022

Not at all.

If I get floats from $somewhere, I might want to index them into a hash data structure without creating a huge sparse array.

If you assume float numbers have "identity" such that, whatever the precision, "x == x" always holds (other than NaN), then this is a perfectly valid thing to want.

makapuf · on Nov 20, 2022

I understand your point, but really floats are not reals and we should test for equality of a float+precision.

metadat · on Nov 19, 2022

Yes, why would you ever want a map with floating point keys?

I'm struggling to think of a valid use case where there is no better alternative.

Any design utilizing this language "feature" seems masochistic and begging to get sliced by sheet metal edges.

kazinator · on Nov 19, 2022

Say I'm in a dynamic language and have a dictionary in which an object of any type whatsoever can be a key. Why would I arbitrarily disallow floating-point objects, if a character, string, lexical closure, database connection, file handle, ..., or lists or arrays of these can all be hash keys.

A hash table of floating-point values can be used for, say, memoizing a function whose argument (or arguments) are floating-point.

A compiler could use a floating-point-keyed hash table for deduplicating identical floating-point constants. Say that constants are stored in some static area, and referenced by address: it's wasteful to repeat those constants. Some constant defining mechanisms (like #define in C) proliferate copies of a constant as a repeated subexpression.

derefr · on Nov 19, 2022

Because all those other things are either identity types (handles, closures, etc) or are value types where the only possible way to represent them is with a single "canonical form."

But IEEE754 allows not only NaNs, but also "negative zero" and denormals. Floating-point numbers, in other words, allow for multiple different bit-encodings that represent mathematically equal, but not identical, number values. There are non-canonical forms of the "same" numbers. And programming-language runtimes don't do anything to prevent CPUs from returning these non-canonical numbers; nor do they massage them back into canonical forms upon receiving them. They just end up blindly holding these non-canonical numbers — numbers which, if they ask the CPU if they're equal, they are; but if they look at the bit-patterns and compare those for equality, they're not.

> A compiler could use a floating-point-keyed hash table for deduplicating identical floating-point constants.

Bad example. A compiler wouldn't want a hash table whose key type is a floating-point number, because that would imply that the hash table is operating using IEEE754 definition of equality via-a-vis key presence collision checking.

Rather, a compiler would use a bitstring key type, where the keys are the bit patterns representing either the target ISA's native FP-register encodings of the given floating-point numbers; or the abstract/formal bit-packed form of those floating-point numbers according to IEEE754.

The difference between the two is that the latter uses bitstring collation (ordering and equality), not floating-point collation.

wruza · on Nov 20, 2022

value types where the only possible way to represent them is with a single "canonical form." But IEEE754 allows not only NaNs, but also "negative zero" and denormals

Unicore strings also have normalization forms. Not to make an argument, just a reminder.

derefr · on Nov 21, 2022

True — but Unicode strings that are not bit-equal aren't collation-equal either, so that doesn't usually matter. A non-canonicalized Unicode string won't "find" its canonical counterpart in a hashtable.

IEEE754 floats (and doubles) are kind of unique among scalar types, in that the answer to "are these equal" in pretty much every programming language is delegated directly to the CPU to answer; and, in obeying IEEE754 semantics, the CPU's definition of equality for FP numbers makes things equal that aren't bit-representation-equal.

wruza · on Nov 20, 2022

Having a generic container that handles all types of values except one only for policy reasons is rarely a good idea. If floating point instabilities should be a concern (yes), then not at the Map level. That would be a naive crutch.

Why would you ever want a number to be stored in a floating point format - that is a much better question.

Most of our numeric values don’t even correspond to the domain of floating point. For business and programming goals, -0 is a complete technical nonsense. NaN too, it should just raise an exception (there are quiet and signalling nans, everyone defaults to quiet). And so should non-low-level integer overflow, really, unless there is a software fallback. Intermediate calculations limited to a low fixed number of digits is nonsense. Accounting for constant rounding/formatting errors is also a nuisance.

Builtin, first class citizen, hardware enabled, base 10 fixed point could be the answer. But there’s almost always either bigint-only which is barely used even in a standard library, or a serious performance hit which nobody wants. There would still be issues, but much less in practice.

Floating point is a niche type suitable for “multimedia”, NNs and maybe few other special contexts. They are the default only because everyone on the hardware..runtime spectrum traditionally believes that ordinary numbers are not their responsibility.

TeMPOraL · on Nov 20, 2022

Memoizing computations for which you know the input(s) come from a small range of numbers?

IEEE floating points are 32 bit / 64 bit, so it's not in principle any more insane than using (u)int32 / (u)int64 as keys. It's not like the key space is unbounded or even hard to estimate - it's just not a common thing to see in the wild, and I guess most devs rarely have a reason to consider how many values fit between two floating point numbers.

kazinator · on Nov 19, 2022

> Yes, why would you ever want a map with floating point keys?

One possible answer: you're in Javascript or something similar. You wanted integer keys, but all you have is floating-point numbers.

metadat · on Nov 19, 2022

Or.. convert the integers to string form and use that. Wacky nonsense situation averted.

yakaccount4 · on Nov 19, 2022

I agree. There are several instances of "valid" keys nobody should use.

metadat · on Nov 19, 2022

100% with you bud.

Another one which probably shouldn't exist is a map with boolean keys.

Edit: It might be nice if a language compiler detects such bastardizations and spits out a proposal for a better alternative structure or approach, perhaps even stubbornly refusing to proceed to compile such a shitty idea.

Golang already kind of does a spiritual form of this by disallowing unused variables.

This would undoubtedly be similarly controversial. Ego ruins all.

masklinn · on Nov 19, 2022

> Yes, why would you ever want a map with floating point keys?

Because it's a cache and you didn't specifically handle that edge case.

malcolmgreaves · on Nov 19, 2022

I don't know what precisely the poster meant. However, from my perspective, I see it as a simpler design to explicitly hash float values into keys.

masklinn · on Nov 19, 2022

That... still requires the foresight to see that edge case, and being in a situation where you're not colliding your float hashes with random integers?

kazinator · on Nov 19, 2022

Externally hashing outside of the hash table requires you to take on ugly complications. What if two distinct floating point keys map to the same key? The values of your hash table then have to be lists or arrays so you can keep all colliding entries. You've then wastefully built chained hashing around a hash table!

metadat · on Nov 19, 2022

The same approach I suggested elsewhere in this thread will work for this case, too. Convert floats to their string representations and key off that. Equality ambiguity problem solved.

bushbaba · on Nov 19, 2022

Not if you’re hash table is designed to accommodate key conflicts, and the float is deterministic.

chrisseaton · on Nov 19, 2022

It's done all the time in JavaScript. Why not in Java?

Ferret7446 · on Nov 20, 2022

Javascript does a ton of things that are extremely questionable, so it's not a good source to say you should do something. Rather, it's a better argument (though still flawed) to say that because it's done all the time in Javascript, it's evidence that you shouldn't do it.

xg15 · on Nov 19, 2022

I'm surprised this is done in JS. If you're doing any kind of calculation with floats (and use actual non-integer numbers, not just ints stored as floats) then it's generally difficult to get the same exact result every time on every system, due to rounding errors. That's why to my knowledge, any strict equality check on floats is generally frowned upon. (exceptions: you know all the floats are actually non-huge ints; You're comparing with a value the variable was explicitly set to at some time before)

chrisseaton · on Nov 19, 2022

JavaScript conceptually only has floats (ignoring big integers, which isn't what most people use for numbers), so any map of numbers to values is using a float key. You refer to that, but yeah they're all floats.

> it's generally difficult to get the same exact result every time on every system, due to rounding errors

Floating point arithmetic is 100% entirely deterministic - you don't just get different values randomly.

cookiengineer · on Nov 19, 2022

This is so wrong

ECMAScript does not have maps where keys can be integers and neither can keys be floats.

Everything is a pointer to a value, and otherwise reflected with a symbol or string (which in return is a unique symbol). An object doesn't have object[1] because it is actually object[reference("1")]

This way boxing and unboxing of arrays is much cheaper, due to the arrays not needing a resizing of their cells once datatypes change.

(TypedArrays are optimized in a different manner, but we are speaking about the NaN use case, which implies either an Array or Object)

There are dozens of talks about e.g. v8's hidden classes on youtube that show up this very mechanism.

No idea where you got that float keys idea from.

chrisseaton · on Nov 19, 2022

> ECMAScript does not have maps where keys can be integers and neither can keys be floats.

  map = new Map();
  map[1] = 3;
  map[1]; // 3
  map[1.2] = 3.4;
  map[1.2]; // 3.4
  typeof(map.keys().next().value); // "number"

I'm not an expert in JavaScript. What am I missing?

xg15 · on Nov 19, 2022

A nitpick: You're setting object properties on a Map object, which only works "accidentally" but does not make use of the map at all[1]. So in this case, it likely works because the numbers are converted to strings before being used as keys.

But in general, using float keys does work - but can fail in unexpected ways due to rounding errors. E.g.:

  m = new Map();
  m.set(0.3, "foo");  // --> Map { 0.3 → "foo" }
  m.get(0.3); // --> "foo"
  m.get(0.1 + 0.2);  // --> undefined

That's because 0.1 + 0.2 == 0.30000000000000004 in IEEE floats, which is != 0.3. [2]

So using float keys which are derived from calculations and may contain non-integer numbers is a bad idea, because unless you have very good knowledge about floating point math, you can not easily predict what exact value the result of a calculation will be.

[1] https://developer.mozilla.org/en-US/docs/Web/JavaScript/Refe...

[2] https://stackoverflow.com/questions/8503157/ieee-754-floatin...

chrisseaton · on Nov 19, 2022

Right - should be using set and get - was written out without running it.

But isn't this still a map that has float and integer keys? Why doesn't it count?

xg15 · on Nov 20, 2022

from what I got, it's a very stupid interaction of different javascript language features.

Every object in JS has basic map functionality. Since version 1 of the language, you could always write things like:

  var o = new Object();
  o["foo"] = "bar";
  o["foo"] // --> "bar"

However, this functionality is relatively limited: It only supports strings (and symbols) as keys and it interacts badly with other methods and properties which are part of the object. Hence why an explicit "Map" type was added to the language much later.

The problem is, Maps are also objects, so they still inherit the "old" map functionality in addition to the new functionality. Those two systems are completely independent, even though they act on the same object. So writing

  mymap["foo"] = 1

and

  mymap.set("foo", 1)

both store the mapping "foo" -> 1, but in completely different places. Only the second one will actually put it into the store of the map, while the first one will just add it as a generic object property.

You can see it when trying to retrieve the value again:

  mymap["foo"] = 1
  mymap["foo"]; // --> 1
  mymap.get("foo"); // --> undefined

likewise:

  mymap.set("bar", 1);
  mymap.get("bar"); // --> 1
  mymap["bar"]; // --> undefined

chrisseaton · on Nov 20, 2022

Yes the square brackets were just a typo. But maps do support float keys - you can try it yourself!

  map = new Map();
  map.set(1.2, 3.4);
  log(typeof(map.keys().next().value));

That says the key is a number, and it's a floating point number. So what did they mean by "ECMAScript does not have maps where keys can be integers and neither can keys be floats"?

xg15 · on Nov 20, 2022

Ah, I misunderstood that, sorry.

I think the GP was wrong there. JS maps absolutely do support float keys, it's just generally a bad idea to use them if you don't restrict yourself to integers.

matt_kantor · on Nov 20, 2022

Object property names are always strings, even though you can set them via number literals (as the person you replied to said):

    const object = {}
    object[1] = 3

    console.log(object["1"]) // 3
    for (const key in object) {
      console.log(typeof key) // "string"
    }

This is different from keys of the Map data structure, which are actually able to be any type of value (even silly stuff like other Maps).

chrisseaton · on Nov 20, 2022

I don't know - I think that proves what I said - map keys can be floats.

matt_kantor · on Nov 20, 2022

For sure, I never disagreed with that (I even said "JavaScript maps absolutely can use numbers (or any type) as keys" in my other comment).

But object property names cannot be floats (which is what your example was, despite them being properties of a Map object).

matt_kantor · on Nov 20, 2022

What runtime did you use for that?

JavaScript maps absolutely can use numbers (or any type) as keys, but that's not how its API works. Square brackets access object properties, but map entries are not stored like that, and that last `typeof` is `"undefined"` in every runtime I tried (not `"number"`). Try `map['set']` to see another example.

Here's what I think you meant:

    const map = new Map();
    map.set(1, 3);
    map.get(1); // 3
    map.set(1.2, 3.4);
    map.get(1.2); // 3.4
    typeof map.keys().next().value; // "number"

chrisseaton · on Nov 20, 2022

Square brackets were a typo - I didn't run it I just wrote it off the top of my head. Doesn't your correct prove what I said though?

matt_kantor · on Nov 20, 2022

We're certainly in agreement that cookiengineer's "ECMAScript does not have maps where keys can be integers and neither can keys be floats" assertion is dead wrong.

cookiengineer · on Nov 20, 2022

Where does Map.__proto__ point to? There's your answer :)

Everything in JS is an Object, therefore everything can have hashed keys. Including Arrays, because Array.__proto__ also points to Object.

kleinsch · on Nov 20, 2022

You’re being pedantic. Nobody is talking about internal representations, they’re saying it’s literally possible to key into a map with a float, which is true.

cookiengineer · on Nov 20, 2022

No it's not, because they are talking about keying via Number.prototype.toString(), which is not keying via float but via its string representation.

And that's my point, it's even part of the ECMAScript spec. See 6.1.6.1 and 6.1.7.1 [1]

Additionally, a proof you can quickly tryout:

object={};

object[-0]=123;

object[+0]; // returns object["0"];

If Object key were a Number type, it would not use Number.prototype.toString().

You can literally create your own datatype and override valueOf() and toString() to play with this.

[1] https://262.ecma-international.org/13.0/

chrisseaton · on Nov 20, 2022

What do you think is happening when you use set with a float on a map?

How do you explain this?

    const map = new Map();
    map.set(1, 3);
    map.get(1); // 3
    map.set(1.2, 3.4);
    map.get(1.2); // 3.4
    typeof map.keys().next().value; // "number"

Where do you think ‘number’ comes from?

Read the spec - it supports primitives for keys and it doesn’t stringing them. As others have said in this thread - you’re just wrong.

panzi · on Nov 19, 2022

It's done in JavaScript because there is not integer type. (Ok, ok, nowadays there is bigint, but historically.) So you do all int like calculations with number (which is IEEE double), but be sure to only do arithmetic where the result is still an integer. That's why you end up using a double as a numeric key. And I think a double can basically hold a 53 bit integer exactly.

idealmedtech · on Nov 19, 2022

This is definitely not my area of expertise, but my understanding was that the mantissa was actually the source of inexactness in floats? I know the power bits (10 bits?) are pretty close to actual twos complement ints, and you could always do things in a C-like union, but I wasn't aware you could directly treat the mantissa like an int

xg15 · on Nov 19, 2022

Ok, that's different then. Using integers as keys should not be a problem, even if they are stored as floats. Problems start if you use non-integers or IEEE special values as keys.

vbezhenar · on Nov 19, 2022

No. JavaScript does that all the time.

ithkuil · on Nov 19, 2022

(Because JS has no integers)

yawaramin · on Nov 19, 2022

That used to be the case but JavaScript has BigIntegers now.

easrng · on Nov 20, 2022

JavaScript actually converts all non-Symbol keys to strings.

pclmulqdq · on Nov 20, 2022

Yet another reason not to do it. Still, if your floats are always integers under a certain size, it works fine to use floats as hash table keys.

xg15 · on Nov 19, 2022

Note that the contract for the equals() method requires that the method always returns true when called on itself. So the designers had to decide whether to break the contract on NaN or the contract on equals(). I could imagine that honoring equals() was the solution with the least expected breakage with downstream users.