Hacker News

Kotlin is eating Scala's hype by hyping how much better their ad-hoc informally-specified implementation of half the advanced type stuff is. Having had to actually debug errors from their "suspend functions", "platform types" and goodness knows what else, there's nothing simpler about it.



I think that, optimally, you'd have a powerful effect and dependent linear type system with adjustable strictness of checking. You'd use the very weakest level for a REPL: warn only if there is no set of types that allows a function to execute without throwing a runtime type error. Most applications would be run/compiled with stricter checking. Libraries packaged up for package managers would presumably be checked with full effects checking and without implicit soundness escape hatches of the kind you get in TypeScript. Security-critical libraries, such as TLS implementations, would hopefully be compiled to make full use of dependent types.

Hopefully you'd also have a lifetime system integrated with the malloc implementation. A malloc implementation needs at minimum a size_t header on each allocation, and you could use this similarly to how OpenJDK / Oracle's JVM lazily allocates rwlocks by "promoting" the object header's GC word to a tagged pointer to an rwlock plus a copy of the original GC word. In this case, you'd probably use 2 bits of the malloc header for tagging, limiting arrays and other large objects to a maximum of 1 GB in 32-bit processes.

Code whose lifetimes check properly would completely ignore the dynamic lifetime accounting, but any code that didn't check properly would need to pass around potentially unsafe references as "fat references": a pair of the reference and an rwlock. The first time an object reference hit a potentially unsafe use, its malloc header would be inflated to a pointer to the original size_t allocation_size plus the rwlock, so that the fat reference could safely be made skinny and fat again as it passed between lifetime-safe and lifetime-unsafe libraries.

Unfortunately, for Rust-like systems that have both atomic and non-atomic reference counting, I think this means having the dynamic library headers contain a list of offsets of the non-atomic reference-count operations so they can be dynamically patched if lifetime-unsafe code is ever loaded. (Or: make the safe libraries twice as big, and modify the program linkage table and patch up all return addresses on stacks the first time any lifetime-unsafe code is loaded.)

Of the two tag bits on the size_t allocation_size in the malloc header, one would record whether the header is a plain size_t or a pointer to struct {size_t allocation_size; size_t rwlock;}. The other would be a "one-bit reference count" for optimizing the common case where there is at most one reference to the object.

Edit: actually, the mixing of lifetime-safe and lifetime-unsafe code falls down for statically compiled systems that allow references into the middle of objects. IBM OS/400's (i5/OS's) Technology Independent Machine Interface (TIMI) uses 128-bit pointers for everything, so maybe it's not so bad to make all references fat references, and optimize them down to regular references when escape analysis shows they will never leave lifetime-safe libraries. In any case, it's more complicated than I originally thought to efficiently mix dynamic race-condition detection with Rust-like compile-time borrow checking that elides it. You could use an atomic hash map to track the locations of the rwlocks, inflating thin references to fat references every time lifetime-unsafe code receives a reference from lifetime-safe code, but all of those lookups sound prohibitively expensive.


Sounds overcomplicated. I've seen any number of fancy not-exactly-type-systems, and they always end up having crazy edge cases, whereas a plain type system does exactly what you expect (and, crucially, does it in an understandable way) and scales arbitrarily far. Having the type system work the same way in the REPL as in the rest of the language is important for that.


I guess you'd then need another mechanism for dealing with redefining APIs in incompatible ways within the REPL. It's a pretty common use case for REPLs to play around with APIs and temporarily break them.

The best alternative is to keep a map of all the definitions with broken type checks, refuse to generate new machine code/bytecode as long as that map is non-empty, and keep using the older definitions until you get back to a sound state of the world.


GHC's REPL supports turning type errors into crashes that fire only when the ill-typed expression is actually used (the -fdefer-type-errors flag), which lets you test one part of your code while another part is broken.

I think you can even extend that to compiled code, for an authentic "it crashed in production" experience.




