Nope, no proofs :) Formalizing questions like this is one reason why I'm interested in category theory, so I don't think I have the tools to dig into this right now. But... "side effect" literally maens it doesn't show up in a normal input/output function signature, and in a pure functional language like Haskell, there are no side effects. Monads are a particular way of explicitly capturing side effects as a "separate" kind of thing from the function output using a particular species of functor.
I suppose my statement was a bit strong in that regard. Monads tend to arise very often in the way we build systems, but that doesn't mean they're the only way to cope with side-effects. It does seem likely that any way to capture "alternate outputs" from a function will end up looking like A -> T(B) in some category, though.
(Incidentally, if "A -> T(B)" looks funky, think about polynomials `f(x)` -- the function symbol `f` is just a placeholder for some expression involving the free variable `x`. Could be `A -> (B, String)`. The monad laws only make sure you have some foundation for reasoning about the extra type structure being added in by `T`.)
First, a clarification: when looking at a monadic action `f : A -> T(B)`, the monad here is just `T`. The action itself is still just a function. The monad `T` lets you add some extra structure onto your usual output type in a principled way.
That being said, it's true that I'm using `T` itself in a function-like way. Category theory does blur the lines between the two ideas: monads are "functors" with some extra structure, and functors are ("just") functions between categories that preserve categorical structure. But it's critical to notice here (and it's apparent from the type `f : A -> T(B)`) that `T` is not the same kind of function that `f` is. `f` lets you move from one type to another, by mapping each value in one to a value of another. `T` lets you move from a whole category to another category, by mapping each object/type in one to an object/type of another, and mapping each arrow/function in one to an arrow/function of another.
In other words, monads occur at the type level, whereas functions occur at the value level [1]. That means that, colloquially, you can't just use a monad "instead of" a function, any more than you can use "Integer" instead of "42". But as I alluded to above, monads over partial orders are closure operators, and we can often model the evolution of data over time as a partial order. So in that domain, monads literally are functions, and the "side effect" of a closure operator is mutably updating a cell by moving its contents up in the order.
If you model the evolution of data as a partial order, you can indeed obtain monads that more closely resemble normal functions. But a partial order is just a particular kind of category, so even here we've built a separate little domain over which our monad exists. (Of course, functors are all about crossing those domains in principled ways.)
[1] That's why it's not "IO ()" or "Maybe Int" that are monads; it's "IO" and "Maybe" themselves, which are well-behaved functions from types to types.
I quite belatedly realized that you come from Category Theory, so most of my sibling reply is probably old news to you. Sorry!
The most charitable response would be "yes", but it really depends on how you model your domain. Most instances of monads in software are at the type level, not at the value level, and "function" doesn't usually make sense at the type level.
The example I gave of a monad over posets does arise in the semantics of logic programming, but I haven't seen explicit recognition of monads in that area. Not that I've looked that hard, yet.
I suppose my statement was a bit strong in that regard. Monads tend to arise very often in the way we build systems, but that doesn't mean they're the only way to cope with side-effects. It does seem likely that any way to capture "alternate outputs" from a function will end up looking like A -> T(B) in some category, though.
(Incidentally, if "A -> T(B)" looks funky, think about polynomials `f(x)` -- the function symbol `f` is just a placeholder for some expression involving the free variable `x`. Could be `A -> (B, String)`. The monad laws only make sure you have some foundation for reasoning about the extra type structure being added in by `T`.)