Those names are not very good. return takes a value and "monadifies" it. If you have a function that takes a normal argument and emits a monadified value (e.g. int -> monad string), then bind can transform it into a function that takes a monadified argument (e.g. monad int -> monad string) instead.
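In Haskell types, that looks roughly like this (a sketch; the bind described here is the standard >>= with its arguments flipped):

    return :: a -> m a                       -- "monadify" a value
    bind   :: (a -> m b) -> (m a -> m b)     -- upgrade a function to accept a monadified argument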
return and bind satisfy three laws:
1. If you have a function that takes a normal argument and emits a monadified value, you could apply it to some argument directly. Or you could monadify the argument with return, and then use bind on the function so it will accept the monadified argument. These two ways of using the function give the same result.
2. Remember how return takes an argument, and emits a monadified value? That means we can bind it! The second law states that the resulting function is just the identity for monadified stuff.
3. Let's say you have two functions f (e.g. float -> monad string) and g (e.g. string -> monad int) from a normal argument to a monadified value. We could bind them both and then compose them.
g_ = bind(g)
f_ = bind(f)
g_f_(x) = g_(f_(x))
Or we could instead bind only the second function, then compose it with the unbound first function, and then bind the composition.
g_ = bind(g)
g_f(x) = g_(f(x))
g_f_ = bind(g_f)
Law 3 states that these two procedures yield the same g_f_. (All three laws are written out in Haskell just below.)
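For reference, here is one way to write the three laws in Haskell, defining bind as >>= with its arguments flipped (a sketch):

    bind :: Monad m => (a -> m b) -> (m a -> m b)
    bind f m = m >>= f

    -- Law 1 (left identity):   bind f (return x)  ==  f x
    -- Law 2 (right identity):  bind return m      ==  m
    -- Law 3 (associativity):   bind g . bind f    ==  bind (bind g . f)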
Having offered this kindly explanation, would you say that this continues in the holy pursuit of a 'simple explanation' of what a monad is?
Here's an intermediate question triggered by your explanation: what is the byte-level nature of your "monadification"? You say one has a function which converts a floating-point value to a "monad string", a string value to a "monad int" (a raven into a 'monad writing desk')? Exactly what bytes distinguish a "monad string" from a string? A "monad int" from a plain old int? Is it perhaps a structure? Maybe it has stack pointers among those structural elements?
If 'm' is a monad, then the difference between a 'String' and an 'm String' (monad string) is most likely just a constructor, but it's really up to that particular monad's implementation. It isn't a byte-level difference, it's a context-level difference. Constructors are very different in Haskell than in (insert imperative language here). They are much simpler. Monads are similarly simple but hard to explain without any familiarity with the language. I suggest you just dive in. The cool thing about Haskell is you don't have to understand everything in order to work with it.
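To make "just a constructor" concrete, here's a sketch (the type names are mine, purely for illustration):

    newtype Wrapped a = Wrap a        -- a newtype: erased at runtime, zero byte-level cost
    data Tagged a = None | Some a     -- a tagged type: the "context" is one constructor tag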
There are many monads, so the answer will depend on which one it is. For instance, the "Maybe" monad I described in a sibling comment to yours could be implemented as a pointer that is either null or not.
Another monad could be a Logger monad. "return" transforms a value into a pair of that value and an empty log. So in this case a monad int could be implemented as a pair<int, string>. (bind would take a function that starts logging from scratch and make it a function that appends to a preexisting log.)
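A sketch of that Logger monad in Haskell (the names Logger, ret, and bindL are mine):

    newtype Logger a = Logger (a, String)

    ret :: a -> Logger a
    ret x = Logger (x, "")                 -- pair the value with an empty log

    bindL :: (a -> Logger b) -> Logger a -> Logger b
    bindL f (Logger (x, logSoFar)) =
      let Logger (y, entries) = f x        -- f starts its log from scratch
      in  Logger (y, logSoFar ++ entries)  -- ...and we append it to the old log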
It's really, really vital to note that "monad" is not a noun but an adjective. Your question thus has no answer as stated.
Let's pick a type which happens to be monadic, though, like list. If `A` is a type and `List A` is the type of singly-linked lists with values in `A`, then `fun a -> Cons a Nil` is what's sometimes called `return`, `flatten :: List (List A) -> List A` is what's sometimes called `join`, and (List, return, join) is monadic.
And anyway, the byte-structure of SLLs can vary from place to place, but any good implementation that works generically will do.
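Spelled out in Haskell, a sketch of the definitions implied above (ret plays the role of `return`):

    data List a = Nil | Cons a (List a)

    ret :: a -> List a
    ret a = Cons a Nil                    -- the `return`

    flatten :: List (List a) -> List a   -- the `join`
    flatten Nil           = Nil
    flatten (Cons xs xss) = append xs (flatten xss)

    append :: List a -> List a -> List a
    append Nil         ys = ys
    append (Cons x xs) ys = Cons x (append xs ys)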
Since you raised the linguistic characterization ('monad' is an adjective, not a noun, so your question doesn't apply), would you therefore say there's "thus no answer" to the question: what's the difference between a "blue envelope" and a basic envelope? ('blue' being an adjective)
In my experience, here's what's often lacking in the teaching of the latest programming formalisms: folks need the new concepts explained in terms of the old in order to understand them (hence why i was trying to get someone to explain monads in terms of crusty old terms like (C) structures). The frustrating tendency always seems to be to concoct a host of new terms, and then to define them only among those unfamiliar terms; often giving well-intentioned detailed examples, yet only in those novel terms and syntax. It would be akin to teaching a foreign language without ever picking up a familiar object and pointing to it while saying the new term for it.
By the bye, i accept and appreciate the explanation given somewhere here (likely by 'im3w1l') that monadization generates a different structure for each different incidence of monadification. This helps with conceptualizing the mechanism, in a way that "binding of flattened functors returning null or non-null monads" doesn't. It's worth assuming that we didn't all spring forth full-grown from the forehead of a particular computer science theory class.
Nonetheless, the efforts are appreciated, even given that the root commenter here promised that monad was a "simple concept" with a simple definition. If you scan about this comment section, do you think this view is well supported?
I agree with you, but think monad is a poor place to start such a discourse. It's a "simple" concept once you've gotten some other ideas firmly cemented.
I also think there's the standard "simple/easy" dichotomy going on here. Monad is SO simple that it can be quite hard to grok in the same way that quantum physics is.
I fully agree with im3w1l's note about different things coming from different "monadifications" and find that completely consistent with what I've been saying. I'm not sure who uttered the other quote but it's as misleading as it is nonsensical. I think that's the risk with explaining this stuff partially anyway: simplicity isn't necessarily easy and there are lots of partial understandings of "monad" floating around.
hm... "it's so simple it's hard to understand". i'm also very familiar with the 'monads can't be explained until later!' concept. yet folks who say they understand monads so often make heavy use of the term considerably earlier than "later".
"quantum physics" is an interesting reference for me. as it's often my job to implement quantum mechanics in software. computers are dumb. they 'understand' little beyond patches of memory upon which simple algebraic operations are done. so when i'm teaching quantum mechanics i often find it useful to explain how one applies it for (dumb) computers. "ooooh! second order differential equations are implemented as simple iterative passes over an array of floating point values - that's easy!" the student comes to see through the notable complexity of quantum mechanics by observing how we 'teach' it to a computer. why the hell do i bore you-all with this? well, it's possible that showing how one implements a monad in terms of dumb patches of memory, dumb interrupt vectors, bits and bytes, might not be a bad starting point for a common language ..maybe. anyway thank you for trying (it won't be the last time monads are null-functor flattened re-re-re-hashed)
Fair enough, and I also gave a "raw patches of memory" example earlier, provided that you are willing to buy linked lists as common language :)
QM via approximation is an interesting point! I suppose in that sense my metaphor breaks down. Approximation works in some fashions, but you need notions of convergence to make that go. These cannot be had in describing monad-nature.
I'd definitely say that once you "have" the concept it becomes standard and nigh universally useful vocabulary. You want to use it a lot because it makes a lot of sense to do so, but this isn't a good didactic method.
I also kind of want to argue generally against the idea that computers are just dumb machines capable only of shuffling memory around. Of course to a certain degree this is true, but only in the same way that algebra is just a series of symbolic algorithms. It's true, but there's nothing interesting to be had from that POV.
The fun stuff occurs when you take the perspective that what's going on inside the model represents faithfully something more interesting going on inside our own heads or out in the world.
Monads are a powerful, simple, and subtle thing which happens entirely abstractly; it's just up to us as humans to recognize the pattern (or, equivalently, up to automated computer algebra pattern inference machines to do the same).
Also, with your edit it's worth noting that your given fmap cannot be right: it violates both functor laws. As it turns out, for Haskell and most datatype-like functors there is exactly one law-abiding implementation of fmap implicit in the structure of the type.
Sorry, that's a good point. The triple formalism (T, mu, eta) already assumes T to be a functor, but I didn't state that explicitly! So you need (List, map, return, flatten), all of them.
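For the List sketch earlier in the thread, that unique law-abiding map would be (again, a sketch):

    mapL :: (a -> b) -> List a -> List b
    mapL _ Nil         = Nil
    mapL f (Cons x xs) = Cons (f x) (mapL f xs)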
Ok, let me give you an example of a monad! The Maybe monad. In this case a monadified value either holds a normal value, or it holds a null. Ok, so for the operations. "return" should monadify a value. In this case it does so by creating a Maybe that holds that value (Maybes that hold null cannot be created with return; they must be created in some other way). The bind operation takes a function that doesn't accept a Maybe value and creates a new function that does. If the Maybe holds a normal value, then the function is run like normal. But if the Maybe holds a null, the function is not run. Instead, null is emitted directly.
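Here's what that could look like in Haskell, hand-rolled rather than using the standard library's Maybe (the names Maybe', Null, Value, ret, and bindM are mine):

    data Maybe' a = Null | Value a

    ret :: a -> Maybe' a
    ret x = Value x                 -- return never creates a Null

    bindM :: (a -> Maybe' b) -> Maybe' a -> Maybe' b
    bindM _ Null      = Null        -- a null skips the function entirely
    bindM f (Value x) = f x         -- a normal value is fed to the function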
Why is this a good thing? It means you can chain together many failure-prone operations, and you only have to check for null at the end, not in the middle. So it works a little bit like exceptions.
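For instance (halveM and quarterM are hypothetical failure-prone steps, invented for illustration):

    halveM :: Int -> Maybe' Int
    halveM n = if even n then Value (n `div` 2) else Null

    quarterM :: Int -> Maybe' Int
    quarterM = bindM halveM . halveM     -- chained, with no null check in the middle

    -- quarterM 12 == Value 3;  quarterM 6 == Null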
Let's verify the laws:
1. We have some value, and some function that could take that value and either emit a result or a null. If we monadify our value, we get a Maybe that holds that value. If we bind our function and then run it with our Maybe, what happens? Well, we know our Maybe is not null, because we just put in a totally legit value. So because of how we defined our bind, the function will run just like normal.
2. Ok, so let's bind "return" and see what happens. If the input is a Maybe that holds null, then we never run "return" but just emit null. Consistent with identity!
If the input is a Maybe with a normal value, then we run "return" with that value. Return creates a Maybe that holds that value. So we get out what we put in. It works!
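In terms of the Maybe' sketch above, laws 1 and 2 come down to these equations:

    -- Law 1: bindM f (ret x)  =  bindM f (Value x)  =  f x
    -- Law 2: bindM ret Null   =  Null
    --        bindM ret (Value x)  =  ret x  =  Value x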
3. There are a number of cases we have to go through but let's do them one by one.
First construction:
a) x is null. f_ skips f and just emits null. g_ skips g and just emits null.
b) x is non-null. First we put x into f_. Since x is not null, we just apply f to the normal value. Now f can either emit a normal value or null.
b1) f emits null. g_ is skipped; the result is null.
b2) f emits a normal value. g is run with it, and the result of the overall computation is g(f(normal argument)).
Second construction:
a) x is null. g_f_ skips g_f and just emits null. Same result as before.
b) x is not null. g_f is run with the value in x. To run g_f we first run f. Two cases: f emits a normal value or null.
b1) f emits null. g_ of null is null. So if the value in x causes f to emit null, we get null, just as before.
b2) f does not emit null. g_ just runs g with the normal result. So if the value in x does not cause f to emit null, the result of the computation is g(f(normal argument)).
So both constructions behave the same: they keep running until a problem arises.
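The same case analysis, compressed into Haskell using the Maybe' sketch from earlier (the construction names are mine):

    firstC, secondC :: (a -> Maybe' b) -> (b -> Maybe' c) -> Maybe' a -> Maybe' c
    firstC  f g = bindM g . bindM f      -- bind both, then compose
    secondC f g = bindM (bindM g . f)    -- bind g, compose with plain f, bind the result

    -- case Null:    both emit Null immediately
    -- case Value x: both reduce to bindM g (f x), i.e. g(f(x)) with null short-circuiting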