OK, go on then: explain what else it is, other than an accidental convention for ensuring the order of evaluation of a pair of expressions in a particular lazy, pure functional language?
Erlang, for example, being a functional but strict language, requires no monads. The same goes for Standard ML. In these languages a monad would be a useless, redundant abstraction that would only clutter the code.
I've been using monads in Scala to avoid throwing wild exceptions while still being able to stop a computation immediately when needed. For example, you validate things with a Validation[A] type, which can be either Valid or Invalid. The binding function in Scala is called flatMap: flatMap on a Valid value applies the given function to the wrapped value, while flatMap on an Invalid never calls the function at all, so the chain of operations stops there.
Now in the main function we can handle Valid and Invalid with pattern matching, and look: we can stop the computation without throwing exceptions, which makes testing and everything else way simpler.
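For what it's worth, the same short-circuiting pattern can be sketched in Haskell with Either, where Right plays the role of Valid and Left of Invalid; the validators here (validateName, validateAge) are made up for illustration:

    -- (>>=) on a Right applies the function to the wrapped value;
    -- on a Left it returns the Left unchanged, so the chain stops
    -- at the first failure - no exceptions thrown.
    validateName :: String -> Either String String
    validateName s
      | null s    = Left "name is empty"
      | otherwise = Right s

    validateAge :: Int -> Either String Int
    validateAge n
      | n >= 0 && n < 150 = Right n
      | otherwise         = Left "age out of range"

    mkUser :: String -> Int -> Either String (String, Int)
    mkUser name age =
      validateName name >>= \n ->
      validateAge age   >>= \a ->
      Right (n, a)

The caller then pattern matches on Right and Left exactly as you describe for Valid and Invalid.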
While I really appreciate your cleverness, I would argue that this is an example of so-called overengineering or, to borrow an analogy from architecture, redundant decoration. Why do I need all these complications instead of a predicate?
OK, in some statically typed languages that perform type inference, there is a restriction: conditionals and aggregates must be homogeneous in type, otherwise all your type inference falls apart. To address this problem we could use simple data structures, like tuples, or we could create a new data type. The canonical example is the Maybe type, with its Nothing and Just T cases. Because the different branches of a conditional must be of the same type, this is that type. Also, for the sake of type consistency, it is a parameterized type. In old-school languages we would just return nil, or (values ....). Semantically there is no difference.
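A tiny Haskell example of the homogeneity I mean - both branches of the conditional must share a type, so the "absent" branch gets wrapped in the same parameterized type as the present one:

    -- the whole `if` expression has the single type Maybe Int
    parsePositive :: Int -> Maybe Int
    parsePositive n = if n > 0 then Just n else Nothing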
Ideally, types and semantics should not interfere. Complicating semantics in order to satisfy a type-system is a controversial idea.
As for monads - a monad is just an abstract data type, nothing special, in which semantics and type information complement each other. It was created to keep types consistent: a parameterized type along with two procedures.
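You can see this in the shape of the type class itself. Here is a simplified sketch (the real Haskell Monad class calls these return and (>>=), and has an Applicative superclass, which I omit):

    class Monad' m where
      unit :: a -> m a                  -- lift a plain value into the type
      bind :: m a -> (a -> m b) -> m b  -- chain two computations together

That really is all it is at the type level: one parameterized type, two procedures.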
As an old-school programmer, I used to think about types as tags, the way they are in Lisps. (Of course, I know that these tags could be arranged in categories and hierarchies, and that so-called "laws" could be defined over them.) So, in my view, this is nothing but nested type-tags. It makes it easy to view semantics and types separately.
In Haskell a monad has another "function", which, in my opinion, is the reason it was created. Along with satisfying the type system, it also ensures an order of evaluation in a lazy language. The semantics is obvious - you evaluate an expression, then lift (type-tag) the value back into the monad, so the whole expression has the monadic type M T.
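A minimal illustration with IO: bind pins down the order, and each step's result is lifted back into the monadic type before the next step sees it.

    main :: IO ()
    main =
      getLine >>= \name ->            -- this action is performed first...
      putStrLn ("hello, " ++ name)    -- ...and only then this one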
OK, this parameterized type is justified in Haskell, but in other languages, in my opinion, it is a redundant decoration, not a necessity.
Hardly. Erlang and OCaml programs run in an implicit, ambient monad that makes some default choices - "single thread of execution, sequential, deterministic state, has exceptions" - and each will reach for explicit monads when a different choice of monad is useful.
As a simple example: if you'll buy that, by your reasoning, OCaml would find monads to be a "useless, redundant abstraction which will only clutter the code", then I'd ask why both Async and Lwt are built to be monadic. Or, if you still feel that's a mistake, how would you design them otherwise?
Or rather, I think your perspective is better served by a language which supports monads than by one without. If I want a new execution context in Haskell, it's a mere ADT; if I don't think it's useful, I use something else.
My point is that everything imperative lives in a monad. If you want, you can choose to be explicit about it, which opens new freedoms. Or, if you don't, you can just use the ambient one in languages that have one. Sometimes people realize great things by picking new monads. If it's convenient enough, you can do it every other line to great effect.
Ambient monads can be a little annoying, though, since when they exist you cannot "get outside of them", which reduces your flexibility.
It's for being able to write functions that are generic in the context they operate in. You absolutely would want to do this in Erlang, or in SML if the type system supported it.
E.g. I can use Future for managing async-ness. I can use a system a bit like http://typelevel.org/blog/2013/10/18/treelog.html for weaving statistics collection through my computation. I use Free to express database operations that need to happen in a transaction. I can use Either to handle operations that might fail.
I can write a method that builds a report from a bunch of rows that works with any of these four contexts, because all of them form monads.
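As a hedged sketch of what that looks like, here it is in Haskell terms (fetchRow is an assumed, caller-supplied action; the point is that buildReport never names a concrete context):

    -- works in IO, Either e, State s, a Future-like monad, etc.
    buildReport :: Monad m => (Int -> m String) -> [Int] -> m [String]
    buildReport fetchRow ids = mapM fetchRow ids

Swap in a different monad and the same report-building logic picks up async-ness, failure handling, logging, or transactionality for free.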
That's because IO implies an explicit order: Monad as an ADT has been used to evaluate one expression at a time, explicitly. It has no "magic" or "special mathematical" properties.
Haskell's do notation is syntactic sugar over monads which effectively allows you to write 'imperative-looking' code while still carrying a local state forward without mutation. The Wikibook[1] does a pretty good job of explaining what this looks like (though I'm guessing you already know this).
Now, obviously it is true that one of do notation's advantages is the same as any other monad usage: it allows us to explicitly sequence events in a lazy language that otherwise offers no (obviously intuitive) guarantees on evaluation order. In that sense it's nothing more than sugaring over the otherwise necessary usage of a lot of ugly >> and >>= operators everywhere in increasingly annoying indentation.
But the other thing it offers is a syntactic sugaring over carrying state forward into successive computations (like the State monad[2]), which still carries at least some useful sweetness in a language that is otherwise functionally pure, which is why F# generalized the concept even further to computation expressions[3].
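To make the sugaring concrete, here's a small State example (using Control.Monad.State from the mtl package); the do block reads imperatively, but it desugars to plain binds that thread the counter along without mutation:

    import Control.Monad.State

    increment :: State Int Int
    increment = do
      n <- get
      put (n + 1)
      return n

    -- the same function with the sugar removed:
    increment' :: State Int Int
    increment' = get >>= \n -> put (n + 1) >> return n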
Looked at another way, do notation, or something like it, can be used to sugar over something that looks rather more like the Clojure ->/->> operators, where the initial value is essentially a local namespace. Much like the threading macros, the result even appears to be doing a kind of mutation, even though it's actually doing nothing of the sort.
This kind of thing turns out to be useful for games, for instance, as the State monad example linked above shows. In games we often have a main update loop, where we have to do several successive operations on our game that might change the state. We can do this a number of ways, but one way is with something like do notation, where for instance (in some hypothetical language) we might do this:
    do with gameState
      oldGame   <- gameState
      gameState <- checkInput
      gameState <- tick
      if gameState != oldGame
        draw
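A hedged Haskell rendering of that hypothetical loop, using the State monad; Game, checkInput, and tick here are stand-ins invented for the sketch:

    import Control.Monad.State

    data Game = Game { frame :: Int } deriving Eq

    checkInput :: State Game ()
    checkInput = return ()  -- stand-in: would fold pending input into the state

    tick :: State Game ()
    tick = modify (\g -> g { frame = frame g + 1 })

    -- one pass of the update loop: the new state, plus whether
    -- anything changed (i.e. whether we need to redraw)
    step :: Game -> (Game, Bool)
    step old = let new = execState (checkInput >> tick) old
               in (new, new /= old)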
And all of this kind of "fake mutation" can be handled underneath the sugar in a purely functional manner. It's something I've been meaning to put into Heresy for some time. Heresy uses continuation-based loops that have a "carry value" which can be passed from one cycle to the next. It's a simple matter of some macro magic to layer syntactic sugar over this that makes the carry value effectively a namespace, one that can be altered from one statement to the next, but all entirely without actual mutation underneath.
You can write whole imperative, mutation-riddled languages in purely functional ones this way. There's an implementation of BASIC that runs in the Haskell do notation.[4]