The key thing with bounds checks is to hoist them out of inner loops. If you don...

zozbot234 · on Aug 2, 2020

> The key thing with bounds checks is to hoist them out of inner loops.

The compiler can't always hoist the check on its own, because program behavior might depend on the bounds check occurring in the loop. But you can write an assert!() outside the loop to hoist it explicitly, and verify that the bounds checks are optimized away - or use unsafe unchecked access when they aren't.

IshKebab · on Aug 2, 2020

How do you write the assert? I've not heard of that before.

Oh you mean `assert!(array.len() > 100); for i in 0..100 { array[I]; }`

I don't think that guarantees that bounds checks will be hoisted. It's just a strong hint. I mean in this case it will almost certainly work, but in more complexes cases it might not and the compiler is still free to emit bounds checks without telling you.

It would be nice if there was an explicit way of forcing an error if the bounds check was not hoisted. Similar story for lots of other optimisations - autovectorisation, tail call optimization, etc.

Some game developer made a good point that sometimes fast is correct, i.e. it's actually a bug if autovectorisation or whatever doesn't happen, so you really need a way to guarantee it.

nicoburns · on Aug 2, 2020

In Rust, the way to force removal of bounds checks is to use iterators rather than a for loop.

IshKebab · on Aug 3, 2020

I don't think that forces their removal either. It just happens to help the compiler enough that it can remove them automatically most of the time.

As far as I know the only way to guarantee that bounds checks are not used in a block of code is to use `unsafe`.

dexterlemmer · on Aug 6, 2020

Iterators do indeed not force the removal of bounds check, but that's simply because they don't exist to remove in the first place. That's because iterators actually do use `unsafe` internally. They are a safe abstraction.

  // This is unidiomatic. Don't do this.
  for i in 0..100 {
    // Oops! Array indexing. Here's a nasty bounds check!
    array[i] * 2
  }

  // Very unidiomatic. Never do this! Not even as an
  // optimization. You should properly use safe abstractions
  // in stead.
  for i in 1..100 {
    // Oh, NOOOO! This isn't supposed to be C!
    unsafe { array.get_unchecked(i) * 2 }
  }

  // This is still unidiomatic unless you want to use
  // it to explicitly show your code has side-effects.
  // However, we're completely rid of the bounds check and
  // yet it's perfectly safe perfectly safe.
  // The only way this could be unsafe is if rustbelt's
  // formal proofs, reviews, audits
  // and (probably) MLOC of thoroughly tested and
  // possibly fuzzed code using this in practice
  // have all somehow missed an unsoundness bug
  // in one of the more used safe abstractions in the
  // core library for years.

  for a in array {
    // Look, Ma! No indexing, therefore no bounds check!
    // Internally `get_unchecked` is (conceptually) used
    // but it's perfectly safe!
    a * 2
  }

  // This is idiomatic Rust. Again, there's no bounds check
  // to start with, since the save abstraction knows
  // exactly how many elements are in the array and
  // that the borrow checker will ensure nobody can
  // possibly invalidate its indexing.
  // Ditto what was said above that the safe abstraction
  // is pretty much guaranteed to be sound.
  a.iter().map(|a| a * 2).collect()

connicpu · on Aug 2, 2020

Reminds me of an interesting thing I noticed, a few times the compiler has been able to make a tighter inner loop simply by adding an `assert!` on the array length before the loop. I guess the compiler considers the specific error message that will be printed to be a side effect, so the assert moves that check to before the loop instead of at some point during the loop.

pjmlp · on Aug 2, 2020

Disabling bounds checking seems to still be a cargo cult thing.

Even when using C++ I keep them enabled (via compiler options) and it is seldom a problem in the age of Electron apps fashion.

Doing something in ms instead of us is hardly an issue for 99% of applications.

Animats · on Aug 2, 2020

Right. It's mostly a thing in number-crunching code working on matrices and in some string operations. It's important to handle small inner loops well; other than that, it's seldom a big time issue.

saagarjha · on Aug 2, 2020

Speculative execution mitigations throw in another performance wrench. Might as well drop the need for them altogether, which is what happens with a lot of the integrator constructs we have in modern languages.

ridiculous_fish · on Aug 2, 2020

It's hard to know if that optimization is active, and nearly impossible to ensure that it stays active. So for this to work, it would need to be enforced, e.g. the same way that the compiler enforces that a let binding is assigned before use.

Top of my wish list is better support for debug-checked indexing. It would be extremely useful if unsafe get had bounds checking in debug mode.

touisteur · on Aug 2, 2020

Or prove the impossibility of bounds-check failing and then disable them. The perfect use-case for silver-level SPARK. It would make sense to require some proof effort when you want to disable runtime checks...

doonesbury · on Aug 2, 2020

Agree!

Animats · on Aug 2, 2020

Right. Should have said that. Often, after a check on an inner loop subscript has been hoisted, it simplifies to a tautology, like "n <= n".

gameswithgo · on Aug 2, 2020

The idiomatic thing to do in rust is use iterators, and bounds checks are elided when you do. You can't always use iterators of course, at which point llvm is there to optimize what it can.