In fact, this means that writing concurrent code is easier, as you don’t have to worry about synchronizing access to shared state. It also means scaling applications is straightforward, since you are forced to design for horizontal scalability from the get-go (similar to deploying node.js). I've always found it strange that Go has both channels and locks as synchronization primitives.
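For what it's worth, the two primitives map onto different idioms: a lock protects shared state in place, while a channel funnels updates through a single owning goroutine. A minimal Go sketch of both styles guarding the same counter (the loop count and names are just illustrative):

```go
package main

import (
	"fmt"
	"sync"
)

// withMutex guards a shared counter with a lock.
func withMutex() int {
	var mu sync.Mutex
	var wg sync.WaitGroup
	count := 0
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			mu.Lock()
			count++ // shared state, protected by the lock
			mu.Unlock()
		}()
	}
	wg.Wait()
	return count
}

// withChannel gives the counter to one goroutine and sends it updates.
func withChannel() int {
	updates := make(chan int)
	done := make(chan int)
	go func() {
		count := 0
		for delta := range updates {
			count += delta
		}
		done <- count
	}()
	var wg sync.WaitGroup
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			updates <- 1
		}()
	}
	wg.Wait()
	close(updates)
	return <-done
}

func main() {
	fmt.Println(withMutex(), withChannel()) // both print 100
}
```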
Scaling out is natural when you have a large number of tasks, but it's not always an option when you have one big task. Maybe you need to sort a giant collection, or hash a giant input, or multiply a couple giant matrices. Threads with shared memory can get to work on those problems easily, but separate processes can't, at least not without trying to reorganize the problem to avoid doing tons of IO.
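To make the shared-memory case concrete, here is a rough Go sketch (the chunk count and workload are arbitrary): goroutines each work on their own slice of one big array in place, with no copying between them. Separate processes attacking the same problem would have to ship those chunks around as IO instead.

```go
package main

import "fmt"

// parallelSum splits one big slice across workers that share the
// backing array. Only the small per-chunk results cross a channel.
func parallelSum(data []int, workers int) int {
	results := make(chan int, workers)
	chunk := (len(data) + workers - 1) / workers
	for w := 0; w < workers; w++ {
		lo := w * chunk
		hi := lo + chunk
		if lo > len(data) {
			lo = len(data)
		}
		if hi > len(data) {
			hi = len(data)
		}
		go func(part []int) { // shares memory with the caller, no copy
			sum := 0
			for _, v := range part {
				sum += v
			}
			results <- sum
		}(data[lo:hi])
	}
	total := 0
	for w := 0; w < workers; w++ {
		total += <-results
	}
	return total
}

func main() {
	data := make([]int, 1_000_000)
	for i := range data {
		data[i] = i % 7
	}
	fmt.Println(parallelSum(data, 8))
}
```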
Libdill IO is nonblocking and calls return a channel. Reading from the channel yields the coroutine, similar to the await statement in some languages.
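A rough Go analogue of that pattern (not libdill's actual C API; `fetch` here is hypothetical): the call kicks off the work and immediately hands back a channel, and receiving from that channel parks the caller until the result is ready, much like awaiting a promise.

```go
package main

import (
	"fmt"
	"time"
)

// fetch is a hypothetical "nonblocking IO call": it returns right away
// with a channel that will eventually carry the result.
func fetch(name string) <-chan string {
	out := make(chan string, 1)
	go func() {
		time.Sleep(100 * time.Millisecond) // stand-in for real IO
		out <- "data for " + name
	}()
	return out
}

func main() {
	a := fetch("a") // both "IO calls" start immediately
	b := fetch("b")
	// The receives are the "await": this goroutine yields until each
	// result arrives, while other goroutines keep running.
	fmt.Println(<-a)
	fmt.Println(<-b)
}
```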