"Nondeterministic" in automata is a very specialized word that doesn't refer to anything like a random process in a stochastic universe.
It's a representation for state machines in which state transitions are apparently ambiguous: a given state in the NFA can, for exactly the same input, go into multiple other states. This is a very convenient representation for pattern matching.
The way the ambiguity resolves itself is not through randomness, but by the understanding that the machine is in all of the possible states at the same time. Then, in a concrete implementation in a programming language, we represent that situation by using a set of states as the run-time state.
E.g. when the input symbol b is received, some NFA machine goes from state { 0, 3, 5 } to { 1, 3, 7, 8 }; i.e. it is in NFA states 0, 3 and 5, which transition to 1, 3, 7, 8 when b is seen.
NFA graphs can be executed directly by an NFA simulator which calculates these transitions on the fly.
We can also statically figure out all the state sets there can be and their transitions.Then we re-label these sets as the simple states of a deterministic automaton. E.g. { 0, 3, 5 } is dubbed S0, { 1, 3, 7, 8} is dubbed S1. There is a transition from S0 to S1 on b. We can work through all the possible transitions, ferret out all the subsets and their transitions to build a simple machine. That is the "subset construction".
Yes for regular languages but not for higher level ones. For example, deterministic context-free is a subset of context-free. For languages that are turing complete, the question is less about ability to compute and more about the speed at which something can be computed. This is an unsolved problem known as P vs. NP.