How is TTT anything other than a deep learning algorithm? We have a deep learning model, we generate training data from an example, and we use stochastic gradient descent to update the model weights so that its predictions fit that training data. This is the classic DL paradigm. I just don’t see why you would consider this an advancement if your goal is to move “beyond” deep learning.