It seems like there's a tension between making implementation details part of the API (so you can catch mistakes) and hiding detail (which improves reusability and makes changes easier).
For example, if every function is allowed to allocate memory whether it does or not, then changing the implementation of a function so that now it allocates memory is a simple, backward compatible change. If you have to declare that you allocate memory (for example, by adding a new argument, as in Zig), making the same change breaks compatibility, so now you have to change all the callers.
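A toy sketch of that tradeoff in Haskell (my own made-up names, just to illustrate the caller-breaking change, not Zig's actual convention):

    -- Hypothetical types, only so the signatures stand on their own.
    data Doc = Doc
    data Allocator = Allocator

    -- Before: callers can't tell (and don't care) whether render allocates.
    renderV1 :: Doc -> String
    renderV1 _ = "..."

    -- After: allocation is declared in the signature, approximating Zig's
    -- explicit-allocator convention with an extra argument. Behaviour is
    -- unchanged, but every existing call site must now be edited.
    renderV2 :: Allocator -> Doc -> String
    renderV2 _ _ = "..."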
Depending on the circumstances, that might not be considered worth the trouble. It's a reason that Java's checked exceptions aren't popular. Often, reporting a new kind of error shouldn't be a compatibility break; they're all going to get caught and logged the same way anyway.
(An extreme case would be making performance guarantees part of the API, which, outside maybe real-time environments, nobody does. This means that a new version of a library could be much slower than before, maybe causing your servers to become overloaded, without any compiler warning. But performance depends on so many factors that making guarantees is usually infeasible.)
The proposed idea of having a compiler do certain context checks without declaring anything in the API means that there are invisible API constraints that are generated from the implementation. This is sort of like having type inference without the option of declaring a function's type explicitly.
This is great! Not only does the context make it clear what can be done, it could be the only way it could be done. I.e. this can be a capabilities-based system where being able to do something is the 'permission' to do so. I don't know how that would work out when 'main' has all the capabilities (granted to the program/user running it) and it and each called function has to filter out what it wants to disallow (or keep only what it allows) before passing them on to called functions. Should be fine for a small number of capabilities but could get out of hand if too fine-grained.
This line of thinking led me to think: global variables should be, rather, considered good!
Plain text is the problem. We write code so it can be easily read as plain text. Not as living, breathing, running code.
Globals are considered bad because they are hard to trace and reason about. But this is only so with plain text. You cannot click on a variable and see all its dependencies. We must painstakingly trace these.
If we visualized the dependencies and the flow of our code, we could see the complexity as we go. And the "context" becomes automatic. There is no difference between explicit and implicit context.
The key for globals is to have one large graph data structure where everything is related. All your data is modeled the same. No mini-databases spread out over the code. No ad-hoc indexes strewn throughout. No modularity. One single giant data model.
I really feel the same as the article. Context should be built into programming languages, in broad strokes as designed here.
I note that in Scala (and apparently Swift) the implicit parameters are type based rather than name based. This works great for "singletons" but is more restrictive than the coeffect work referenced. It does solve the problem that having just a single namespace for implicit parameters is bad - nominal types live in their parent namespace.
I’d combine the two and allow defining “context parameters” in module scope. Your base logger library could have a “slot” for the logger, for example. The standard library an allocator. Etc. You’d replace them with `with` or similar, and `with Logging.@logger = Logging.nullLogger()` might let you disable logging within a dynamic context (rather than using `without`). I’d allow `with` to be written in series in a block much like `let`, rather than requiring a bunch of indentation (they would fall out of scope at the end of the block anyway).
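For what it’s worth, GHC’s ImplicitParams extension can approximate these “context parameters” and the `with`-style override today. A rough sketch (the Logger/nullLogger names are mine, not a real logging library):

    {-# LANGUAGE ImplicitParams #-}

    newtype Logger = Logger { logMsg :: String -> IO () }

    nullLogger :: Logger
    nullLogger = Logger (\_ -> pure ())

    -- The "slot": any function may demand a logger from its dynamic context.
    doWork :: (?logger :: Logger) => IO ()
    doWork = logMsg ?logger "working"

    main :: IO ()
    main = do
      let ?logger = Logger putStrLn in doWork   -- logs to stdout
      let ?logger = nullLogger in doWork        -- 'with'-style override: logging disabled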
Anyway I’d love to have time to contribute towards something like this!
I'm really interested in the topics discussed in this article. My new Haskell effect system, Bluefin[1], is based on this idea of "contexts" (which Bluefin calls "handles"). Bluefin explicitly passes handles. There's another effect system called effectful[2] which does the implicit version. I prefer the implicit version and disagree with the claim in the article that it will lead to dozens of arguments. What we normally do when we have dozens of things that belong together is that we package them all up into data structures/records/objects that group related functionality. I expect that approach to be applicable to passing contexts too.
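A rough sketch of what I mean by grouping, in plain Haskell (explicitly passed handles bundled into a record; not Bluefin's actual API):

    -- Individual handles, passed explicitly as values.
    data Logger  = Logger  { logLine   :: String -> IO () }
    data Metrics = Metrics { bumpCount :: String -> IO () }

    -- When several always travel together, bundle them into one record and
    -- pass that, instead of letting the argument list grow without bound.
    data AppHandles = AppHandles
      { appLogger  :: Logger
      , appMetrics :: Metrics
      }

    handleRequest :: AppHandles -> String -> IO ()
    handleRequest h req = do
      logLine   (appLogger h)  ("handling " ++ req)
      bumpCount (appMetrics h) "requests"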
> I prefer the implicit version and disagree with the claim in the article that it will lead to dozens of arguments. What we normally do when we have dozens of things that belong together is that we package them all up into data structures/records/objects that group related functionality.
This was the suggestion of "A Second Look at Overloading" [1], where they eliminate class declarations and simply declare which individual functions/operators are overloaded.
From the context (hehe) I think you meant you prefer the explicit version? Personally I'd prefer to make things explicit, even if it's more verbose.
> Bluefin is a Haskell effect system with a new style of API.
> It is distinct from prior effect systems because effects are accessed explicitly through value-level handles which occur as arguments to effectful operations.
LISPs had dynamic scoping for this decades ago and it fell out of fashion. I think a large part of that is because in practice it turns the code into spaghetti, and you can make it much simpler by just adding the "context" as a function argument. No special language features needed.
When everything is scoped dynamically, it is indeed a problem. If you only have a handful of things with a clear syntactic mechanism to indicate when they are being used? Much clearer.
I implemented a slow version of contexts in Smalltalk, just to see how bad it was. Where it shone was in places where you can't pass a context as an arg (eg. binary math operators) either for compatibility reasons or because passing the context would be too noisy.
I disagree it's clearer, because it's spooky action at a distance. An analog is thread locals, which have the same characteristics as a dynamically scoped variable.
I've found dynamic scoping useful so few times I can count them on one hand, and while it's an intriguing feature, I think it makes a lot of sense to use a more explicit abstraction. For example, you shouldn't need dynamic scoping for overloading binary operators - you should have a more clear method that takes a context as an argument instead.
Context is external state, and it's extremely surprising when a fundamental operator like "+" modifies it.
Of course that makes for implicit global dependency, unless you pass a context manager ref around almost like Go does
But at least it'd help scope resources instead of using globals while avoiding a coloring problem (see Go's context, where if an interface doesn't take a context one has to find an alternative means of passing it through).
In Python, thread-local variables and ContextVars get you quite a bit of the way there.
You can even implement “without” patterns, albeit only as runtime checks - though if you’re following the React practice that hooks must be called unconditionally, and you have test coverage, you essentially have compile time checks.
And, speaking of coloring problems, if you’re running on gevent, both become greenlet-local, and you can benefit from concurrency limited only by working memory, without needing explicit async code.
OT, and the HN front page has moved on, but just in case someone is interested, and potentially in chatting... A year or so ago, I was exploring live-ish coding with small code fragments and extremely rich namespace manipulation. Sort of "PL is about naming"(so make that awesomely powerful) + "multi-phase collaborative-compilation extensible-language"(and give those namespaces control over everything). Context on steroids.
Perhaps think of javascript-like objects, but much less impoverished. With multiple inheritance prototyping. And fields that not just shadow, but can compose array/object values. And aliases, renames, deletions, set ops, etc, etc. Extremely succinct specification of namespace derivations and variants. Sort of a configuration system. "Object A is composed of interwoven parts of specifications B and C, and when creating variations on A, the contributions of those parts can be altered by name".
And the namespace then becomes the parse, compile, runtime, and dynamic scope contexts. For Maude-like localized control - "in this lexical context, I want integers absent, numbers to parse as Decimal, use this type system, be implemented using this library, with this allocator, ...". So a lexical scope can begin with `use C99;` or `use Python 3.11;`, and those are "simply" two unremarkable computation contexts. From past experience, it's nice to have a forcing factor of "ok, if we were merely implementing one language then mumble would be satisficing, but when doing n languages, we need to up our game by ...".
My next step was going to be putting together an illustrative demo using the great diversity of computations one might describe as say "2+2". Integers and float types, with diverse and not-always-contiguous bit layouts, cpu flags, finite rings, etc, etc. It ended up backburnered, but summer is coming, and this thread reminded me of it.
It's not really missing in all languages. Haskell has context in the form of type classes, and because all functions are pure, all other behaviour derives from parameters. The only exception might be context outside the program, like operating system state.
Programming language theory has other abstractions for context as well, like delimited continuations.
The programming language that I will never get around to implementing would have implicit parameters, which I think would subsume contexts, dynamic scoping, and effects.
The real issue is figuring out a way to statically propagate implicits up without mandatory explicit annotations or whole program inference.
I believe the Scala creators invented implicits, as an alternative to type classes that fit more naturally with an OOP language. They were later shown to be equivalent to type classes.
Curious that CDI and other dependency injection frameworks are not discussed here; I would have imagined them to be relevant. But otherwise, I think the article does highlight a real problem and I would hope that future PLs improve on this front, allowing for even more "pure" functions in some sense.
The section on exceptions got me thinking of the CL condition system. Idk if it somehow meshes in here; I suspect it might provide some mechanism to inject context somehow... but I have never programmed a single line of CL, so idk.
Context + reactive memo functions is a powerful way to manage global state. Context namespaces add more resolution to grouping Context state. These patterns have served me well over the past decade in javascript/typescript development.
I created ctx-core/be[1] & ctx-core/rmemo[2] to provide Contexts, be_ functions, & reactive memos. It has been a powerful set of abstractions for application code, build logic, devops, animation orchestration, etc.
I would love to use a systems programming language that supports Contexts & compiles fast. Jai looks interesting but it's still in closed beta. With Zig, there was a proposal to dependency inject a context which contains the allocator[3]. It didn't go anywhere though. Golang has Context[4]. It compiles fast but I would like a fast language without a Garbage Collector to complement js/ts for expensive computations. Perhaps Rust could work but the compile time is slower than ideal.
At this point, Golang seems to be the best complement as it addresses expensive computations. Hopefully a productive systems programming language without a GC and with good support for contexts will become available.
“Clang Thread Safety Analysis is a C++ language extension which warns about potential race conditions in code. The analysis is completely static (i.e. compile-time); there is no run-time overhead. The analysis is still under active development, but it is mature enough to be deployed in an industrial setting. It is being developed by Google, in collaboration with CERT/SEI, and is used extensively in Google’s internal code base.
Thread safety analysis works very much like a type system for multi-threaded programs. In addition to declaring the type of data (e.g. int, float, etc.), the programmer can (optionally) declare how access to that data is controlled in a multi-threaded environment. For example, if foo is guarded by the mutex mu, then the analysis will issue a warning whenever a piece of code reads or writes to foo without first locking mu. Similarly, if there are particular routines that should only be called by the GUI thread, then the analysis will warn if other threads call those routines.
[…]
EXCLUDES is an attribute on functions or methods, which declares that the caller must not hold the given capabilities. This annotation is used to prevent deadlock.”
Googling “static deadlock detection” will uncover similar tools (often not as well integrated with the compiler)
There's also Software-Transactional Memory in which deadlocks are pretty much impossible. Though it only really works in languages that can properly track side effects, like Haskell's monads.
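For example, with GHC's stm package (a minimal sketch):

    import Control.Concurrent.STM

    -- No locks to acquire in the wrong order: the transaction either
    -- commits atomically or is retried.
    transfer :: TVar Int -> TVar Int -> Int -> IO ()
    transfer from to amount = atomically $ do
      modifyTVar' from (subtract amount)
      modifyTVar' to   (+ amount)

    main :: IO ()
    main = do
      a <- newTVarIO 100
      b <- newTVarIO 0
      transfer a b 30
      balances <- atomically ((,) <$> readTVar a <*> readTVar b)
      print balances   -- (70,30)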
Each program building its own bespoke core models is endemic to programming. Sometimes we get as refined as "everything is a protobuf" or everything is defined in zod. But then we still have our informal ad-hoc object management or passing systems. Unless there's some inversion of control/dependency injection system normalizing this too. Yes sirree, it would be nice to have some powerful, capable context-type things in languages.
Kubernetes is somewhat distinct in how it normalizes a broad range of infrastructure objects and intents. Context is a collection of state, and common ways of dealing with state are the missing practice, be that at the programming language level or some other systems level.
Functions make assumptions all the time and there is often no way to specify them except with type information. E.g. one can specify that a function takes a float as input but not that the float has to be in [0.0, 1.0]. Hard to catch with static code analysis.
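One common workaround is to push the assumption into a type with a smart constructor, so it is checked once at the boundary. A small Haskell sketch (names invented):

    -- A Double known to lie in [0.0, 1.0]. Keep the constructor unexported
    -- so the only way to obtain one is through the checked mkUnit.
    newtype UnitInterval = UnitInterval Double
      deriving Show

    mkUnit :: Double -> Maybe UnitInterval
    mkUnit x
      | x >= 0.0 && x <= 1.0 = Just (UnitInterval x)
      | otherwise            = Nothing

    -- Downstream code now states the real assumption in its signature.
    blend :: UnitInterval -> Double -> Double -> Double
    blend (UnitInterval t) a b = (1 - t) * a + t * b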
Idea for implicit parameters:
Implicit parameters are values from somewhere up the call stack. So make an object with a get function:
Scope.get('Logger')
But then you can get anything. So add a mechanism to put things in there up the call stack, e.g. a scoped `Scope.set('Logger', logger)` (hypothetical) that only applies to calls made below it.
As I understand it, there’s a difference between dynamic scope and implicit parameters. Implicit parameters don’t escape the lexically defined call stack, making it much easier to reason about.
The idea is also to tie it together with a static compiler that checks everything fits together.
The stated example, too many concerns for implementing platforms and plugins, could be solved by a meta processor that filters source code to your particular platform / database / queue / network arch / etc.
That might be something an AI could do, but I can't imagine any complex project maintaining a secondary mechanism like that
This article is a litany of issues that pure functions solve, despite the backhanded dismissal:
> Much as we like our platonic ideals, our pure functions, we ultimately have to acknowledge that every function exists in a context. The most basic context is the physical hardware on which your code is running
(Note: pure functions set such a high bar that it's a stretch to criticise them. Might as well blame the hardware instead. Which is fine, just be equally dismissive of other 'solutions'.)
> when one fails, it’s up to the engineer to figure out why. In the case of race conditions or other rare occurrences it might not even be possible to reliably reproduce the issue in a debugger,
Pure functions will return the same output for the same input. The issue will reproduce.
> In particular, they should be better at detecting cases where two separate units conflict with each-other.
Pure functions do not conflict with each other - no detection needed. (“Conflicts” can only arise by “passing the wrong thing” from function to function, which I don't think is the idea here, e.g. in `getPetsName = (getName . getPet)`, getName and getPet cannot conflict with each other, but if getName accidentally returns an address, then of course getPetsName will return the pet's address.)
Deadlocks:
> One of the simplest and most pernicious programming problems is the humble deadlock.
> Unfortunately deadlocks are not always so easy to spot, especially as more layers of abstraction are added.
> Deadlocks are common, hard to debug, and can crash an entire app — yet our best defenses against them, integration tests, are porous and blunt.
Pure functions don't deadlock.
> The symptoms are less severe than a deadlock, but the root cause is often quite similar: a function calls a function that calls a function that calls a function that does something inappropriate.
> Compilers don't even try to help here.
It is a compile-time error to call an impure function from a pure function.
> When I call getTimestamp(user.registeredAt, TimeZone.PST) I can assume that it’s just doing some simple math and returning a result. It probably doesn’t make network calls or hold locks or mine bitcoin
No need to assume, check the type signature. Pure function. The compiler will check it for you if you forget:
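(A sketch of the point in Haskell, with made-up types standing in for the article's example.)

    -- A pure signature: same inputs, same output, nothing else.
    getTimestamp :: Int -> String -> Int
    getTimestamp registeredAt _tz = registeredAt + 3600

    -- A caller that is itself pure:
    registrationDay :: Int -> Int
    registrationDay t = getTimestamp t "PST" `div` 86400

    -- If getTimestamp later started making network calls, its type would
    -- have to become Int -> String -> IO Int, and registrationDay would
    -- stop compiling until that effect shows up in its own signature.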
> Context is only dangerous because it is absent from a function's arguments and so is opaque to the caller. The caller, for example, has no way to know if a function could cause a deadlock because it cannot know if the implementation of that function relies on locking.
> Knowing when a function modifies state is half the battle, the other half is knowing when a function reads state. This is not an Effect, but rather a Coeffect — although sometimes it all gets lumped together under the banner of ‘Effects’ or ‘Effect Systems’.
Of course it gets lumped together. That kind of reading is an effect. It can cause deadlocks, it can block your UI, it requires thinking about context, it may not be trivially reproducible.
That's enough about pure functions. The author then pivots to proposed features to support context. These don't solve the above problems, but there's plenty of prior art. So this last bit is more about Haskell and less about pure functions in general.
We have a complicated git merge process at work, with around 70 repos. I wrote a Haskell CLI tool to help me out with the process. I believe I've done exactly what the author is suggesting (explicit, compiler-assisted context markers) by using the 'tagless final' approach. Here are a couple of example type signatures:
ensureCheckedOut :: (Git m, Logger m, Monad m, StatusApi m) => Repo -> m ()
ensureRepoDeleted :: (Logger m, Monad m, Shell m, StatusApi m) => Repo -> m ()
This keeps my context in check by only allowing me to call impure functions if they are provided by one of the constraints on m. E.g. I can call logging from both, but ensureCheckedOut cannot run shell commands, and ensureRepoDeleted cannot invoke my git api (for the pedantic, it could technically use the shell to invoke git via the CLI). Pure functions are of course allowed from anywhere.
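To make it concrete, each constraint is just a type class; roughly (a reconstruction with invented method names, not the actual tool's code):

    -- Each capability is its own class; a function may only use what it
    -- lists among its constraints.
    class Logger m where
      logInfo :: String -> m ()

    class Shell m where
      runShell :: String -> m String

    -- The production interpreter is plain IO.
    instance Logger IO where
      logInfo = putStrLn

    instance Shell IO where
      runShell cmd = pure ("ran: " ++ cmd)   -- stand-in for a real shell call

    -- This one can log but not touch the shell: Shell m is absent from its
    -- constraints, so any call to runShell here is a type error.
    announce :: (Logger m, Monad m) => String -> m ()
    announce repo = do
      logInfo ("processing " ++ repo)
      logInfo "done"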
In short:
* Pure functions do solve the above problems and it's a mistake not to use them wherever possible.
* Context systems do not solve deadlocks, integration issues, reproducibility, etc. But if you think they're useful, Haskell's got you covered using the 'tagless final' approach.
See also: Data, context, and interaction (DCI, 2000-) [1]
> The paradigm separates the domain model (data) from use cases (context) and Roles that objects play (interaction). DCI is complementary to model–view–controller (MVC). MVC as a pattern language is still used to separate the data and its processing from presentation. [1]
> DCI was invented by Trygve Reenskaug, also the inventor of MVC. The current formulation of DCI is mostly the work of Reenskaug and James O. Coplien. [1]