The two key insights that drive the game-physics approach of PBD (which follows decades of spaghetti-at-the-wall experimentation) essentially come down to: choosing a source of error that can be controlled, and not throwing away information too readily.
You end up using position because you can then solve for "only give me an answer with a valid position" - addressing it through motion makes it an indirect process, and errors then become subject to positive feedback loops. This biases PBD towards losing energy, but that's desirable for stability, and XPBD reduces the margin of error.
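To make "solve directly for a valid position" concrete, here is a minimal sketch of the classic PBD move: projecting two particles' positions so a distance constraint holds exactly, weighted by inverse mass. (The function name and 2D setup are my own, for illustration.)

```python
import math

def project_distance(p1, p2, rest, w1=1.0, w2=1.0):
    """Move two particles so their distance equals `rest`.

    PBD corrects positions directly rather than applying forces,
    so the output always satisfies this constraint exactly.
    w1, w2 are inverse masses (0 = pinned).
    """
    dx, dy = p2[0] - p1[0], p2[1] - p1[1]
    d = math.hypot(dx, dy)
    if d == 0.0 or (w1 + w2) == 0.0:
        return p1, p2  # degenerate: nothing sensible to do
    c = d - rest                  # signed constraint violation
    nx, ny = dx / d, dy / d       # constraint gradient direction
    s = c / (w1 + w2)
    p1 = (p1[0] + w1 * s * nx, p1[1] + w1 * s * ny)
    p2 = (p2[0] - w2 * s * nx, p2[1] - w2 * s * ny)
    return p1, p2

# a stretched pair snaps back to rest length in one projection
a, b = project_distance((0.0, 0.0), (2.0, 0.0), rest=1.0)
print(a, b)  # both endpoints moved toward each other
```

Velocities are then re-derived from how far the positions moved, which is exactly where the energy-losing bias comes from.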
You avoid throwing away information by being cautious about when you "go forward" with a solution to the next timestep, possibly keeping multiple solution sets alive and picking between them heuristically. This is something you can do extensively when you are aiming for simple physics with abstract dynamics (platforming games, fighting games, etc.): you already know what kinds of solutions will "look right", so you can test all of them, rank them, and backtrack as needed. When realism is needed the principle still works - you can still rank solutions by making up a metric - it's just made more complicated by the number of answers you get from complex dynamics. That explains why XPBD moves away from "substepping" the physics: it's more important to "go wide" and scan for a single, high-quality solution than to linearize each aspect and hope that smaller steps will reduce the error for you, which was a common approach for abstract dynamics and resulted in biases like "x-axis movement is favored over y". The secret sauce in XPBD's design is in getting the desired qualities in a more analytic fashion, without so much brute-force computation.
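A toy sketch of that "test all of them, rank, backtrack" loop for abstract dynamics - every name and the scoring metric here are made up purely for illustration:

```python
def best_resolution(state, candidates, score, is_valid):
    """'Go wide': evaluate every candidate next-state, keep only
    the valid ones, and rank them by a made-up quality metric
    instead of committing to the first answer found."""
    valid = [c for c in candidates if is_valid(c)]
    if not valid:
        return state  # backtrack: keep the old solution alive
    return max(valid, key=score)

# toy case: a player at (0.3, -0.2) has sunk below the floor at y=0;
# candidate fixes are "push up" or "push far to either side"
player = (0.3, -0.2)
candidates = [
    (0.3, 0.0),    # push up onto the floor
    (-0.7, -0.2),  # push far left
    (1.3, -0.2),   # push far right
]
is_valid = lambda p: p[1] >= 0.0 or abs(p[0] - 0.3) >= 1.0
# rank by smallest displacement: that's what "looks right" here
score = lambda p: -((p[0] - 0.3) ** 2 + (p[1] + 0.2) ** 2)
print(best_resolution(player, candidates, score, is_valid))  # (0.3, 0.0)
```

The ranking metric encodes the designer's prior about what "looks right" - which is exactly the information a purely force-based pipeline throws away.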
> That explains why XPBD moves away from "substepping" the physics
Interestingly, XPBD has moved back to substepping! The relatively recent "Small Steps in Physics Simulation" from Nvidia goes into it, but I can outline the reasoning briefly.
In a physics simulation, there are two main sources of error: the integrator and the solver. Breaking that down a bit:
The integrator is an algorithm to numerically integrate the equations of motion. Some possibly familiar integrators are Euler, Verlet and Runge-Kutta. Euler is a simple integrator with relatively high error (the error scales linearly with timestep size). The most common version of Runge-Kutta is more complex, but its error scales with the 4th power of the timestep.
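Those scaling claims are easy to check numerically. A toy sketch of my own (not from any paper), integrating dx/dt = x from x(0) = 1 to t = 1 and comparing against the exact answer e:

```python
import math

def euler_step(x, h):
    # explicit Euler for dx/dt = x
    return x + h * x

def rk4_step(x, h):
    # classic 4th-order Runge-Kutta for dx/dt = x
    f = lambda x: x
    k1 = f(x)
    k2 = f(x + 0.5 * h * k1)
    k3 = f(x + 0.5 * h * k2)
    k4 = f(x + h * k3)
    return x + (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)

def integrate(step, h, t_end=1.0):
    x = 1.0
    for _ in range(round(t_end / h)):
        x = step(x, h)
    return x

exact = math.e
for h in (0.1, 0.05):
    e_err = abs(integrate(euler_step, h) - exact)
    r_err = abs(integrate(rk4_step, h) - exact)
    print(f"h={h}: Euler err={e_err:.2e}, RK4 err={r_err:.2e}")
# halving h roughly halves the Euler error,
# but cuts the RK4 error by roughly a factor of 16 (2^4)
```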
The solver comes into play because the most stable flavors of integrator (so-called implicit or backwards integrators) spit out a nonlinear system of equations you need to solve each physics frame. Solving a nonlinear system to high accuracy is a difficult iterative process with its own zoo of algorithms.
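For a scalar toy problem the whole pipeline fits in a few lines. Below, implicit Euler on dx/dt = -x³ produces one nonlinear equation per step, solved here with Newton's method - just one member of that zoo, standing in for the fancier constraint solvers used in real engines:

```python
def implicit_euler_step(x, h, tol=1e-12, max_iter=50):
    """One implicit Euler step for dx/dt = -x**3.

    Implicit Euler demands y = x + h*f(y): the unknown y appears on
    both sides, so each physics frame we must solve the nonlinear
    equation g(y) = y + h*y**3 - x = 0. Newton iteration does that.
    """
    y = x  # initial guess: carry over the previous state
    for _ in range(max_iter):
        g = y + h * y**3 - x        # residual; zero at the solution
        dg = 1.0 + 3.0 * h * y * y  # g'(y)
        step = g / dg
        y -= step
        if abs(step) < tol:
            break
    return y

x = 1.0
for _ in range(10):  # ten frames of stable, damped decay
    x = implicit_euler_step(x, h=0.1)
print(x)  # positive and smaller than 1.0: no blow-up
```

Note that the iteration count here is where solver time goes: demanding a tighter `tol` means more Newton steps per frame.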
XPBD uses an implicit-Euler-esque integrator and a simple, but relatively inefficient, Projected Gauss-Seidel solver. For most games, the linear error from the integrator is ugly but acceptable when running at 60 or even 30 frames a second. Unfortunately, for the solver, you have to spend quite a bit of time iterating to get the error low enough. The big insight from the "Small Steps" paper is that the difficulty of the nonlinear equations spat out by the integrator scales with the square of the timestep (more or less -- nonlinear analysis is complicated). So if you double your physics framerate, you only have to spend a quarter of the time per frame in the solver! It turns out the best thing to do is generally to run a single measly iteration of the solver each physics frame, and just fill your performance budget by increasing your physics frames per second. This ends up reducing both integrator and solver errors at no extra cost.
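A quick sketch of that trade-off using a PBD-style pendulum (a particle predicted under gravity, then projected back onto the unit circle with a single constraint "iteration" per step). This is my own toy code, not the paper's: the fixed per-frame budget is split into more, smaller substeps, and the error against a high-rate reference shrinks.

```python
import math

def pbd_pendulum(substeps, frame_dt=1.0 / 60.0, frames=60):
    """Pendulum as a PBD particle on the unit circle.

    Each tiny substep: predict with gravity, project the position
    back onto the constraint |p| = 1 (ONE solver iteration), then
    re-derive velocity from the corrected position."""
    px, py = 1.0, 0.0   # start horizontal
    vx, vy = 0.0, 0.0
    g = -9.81
    h = frame_dt / substeps
    for _ in range(frames * substeps):
        vy += h * g                          # external force
        nx, ny = px + h * vx, py + h * vy    # predicted position
        d = math.hypot(nx, ny)
        cx, cy = nx / d, ny / d              # project onto circle
        vx, vy = (cx - px) / h, (cy - py) / h  # PBD velocity update
        px, py = cx, cy
    return px, py

ref = pbd_pendulum(256)  # high-rate reference trajectory
for n in (1, 4, 16):
    x, y = pbd_pendulum(n)
    err = math.hypot(x - ref[0], y - ref[1])
    print(f"{n:3d} substeps -> error {err:.4f}")
# more, smaller substeps with one iteration each: error drops
```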