Building expression trees is not the only way to do automatic differentiation. A simpler way is to just carry the value and it's derivative as a pair:
#include <math.h>
#include <stdio.h>
struct autodiff { double value, deriv; };
autodiff just(double x) { return { x, 1 }; }
autodiff operator +(autodiff a, autodiff b) {
return { a.value + b.value, a.deriv + b.deriv };
}
autodiff operator *(autodiff a, autodiff b) {
return { a.value * b.value, a.deriv*b.value + a.value*b.deriv };
}
autodiff sin(autodiff a) {
return { sin(a.value), cos(a.value)*a.deriv };
}
int main() {
autodiff x = just(.1);
for (int ii = 0; ii<4; ii++) {
x = x*x + sin(x);
}
printf("value: %lf, deriv: %lf\n", x.value, x.deriv);
return 0;
}
There is no need to differentiate the for loop or the semicolons. This way is not doing symbolic differentiation. It's implementing the differentiation rules in parallel to calculating the values at run time.
This generalizes to partial derivatives for multivariate functions too:
This only works if you don't have x as a value determining the length of the for loop. If you try to take the derivative of x^x using a loop multiplying x times, you'll get a wrong answer (admittedly, expecting a right answer would be foolish).
But this goes to question at hand, whether AD should be a library/DSL or whether it should be a primitive of a general purpose language. The thing is a general purpose language lets you present sorts of things as functions that can't be even dual numbers won't take the derivative of correctly - a dual scheme can't distinguish variable order loops from constant order loops.
That's a fair point (and worth documenting as part of the library), but I'd be much more likely to implement pow(x, x) carefully than switch to a different language or compiler.
I appreciate the library approach and I basically agree with you. If anything, I wanted to point out that a normal general language is so general that adding differentiation into it would be highly costly - just this feature would require that every loop and every conditional return be watched (relative to pow(x,x), an even more challenging example is the Newton's method implementation of a continuous function using a loop).
An alternative would be creating a general purpose language just for this feature - an interesting though rather specialized project.
This generalizes to partial derivatives for multivariate functions too: