
You could do AIXI: "The AIXI formalism says roughly to consider all possible computable models of the environment, Bayes-update them on past experiences, and use the resulting updated predictions to model the expected sensory reward of all possible strategies."

http://wiki.lesswrong.com/wiki/AIXI

So you could use the time loop to run through this algorithm.
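As a toy illustration of that Bayes-update step (time machine not yet required), here is a minimal Python sketch. A few hand-written environment models stand in for "all possible computable models", and every name in it is invented for illustration, not taken from any AIXI implementation:

    # Toy Bayesian mixture in the spirit of AIXI: each model maps a
    # history of past bits to P(next bit = 1).
    models = {
        "always_one":  lambda history: 0.9,
        "always_zero": lambda history: 0.1,
        "alternating": lambda history: 0.1 if history and history[-1] == 1 else 0.9,
    }

    # Uniform prior; AIXI proper weights each program by 2^-(its length).
    weights = {name: 1.0 / len(models) for name in models}

    def update(history, observed_bit):
        # Bayes-update: multiply each weight by the model's likelihood
        # for the observed bit, then renormalize.
        for name, model in models.items():
            p_one = model(history)
            weights[name] *= p_one if observed_bit == 1 else 1.0 - p_one
        total = sum(weights.values())
        for name in weights:
            weights[name] /= total

    def predict(history):
        # Mixture prediction P(next bit = 1) under the current weights.
        return sum(weights[n] * models[n](history) for n in models)

    history = []
    for bit in [1, 0, 1, 0, 1]:
        update(history, bit)
        history.append(bit)
    print(weights)           # "alternating" should now dominate
    print(predict(history))  # and the mixture predicts accordingly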

But what's this about predictions? Why predict when you have a time machine? All you really need is a reward function and this algorithm:

- Get a message from the future with its best-ever reward and how it achieved it.

- Try something else.

- When you get to the future, see whether you've done better than the best ever. If so, replace the best ever.

- With each iteration, keep a count of tries since the last improvement. If you reach some maximum number without beating the best score ever, do the best strategy one last time and quit. (The whole loop is sketched in code below.)
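Expressed as ordinary code (with an outer while standing in for the time loop), a minimal Python sketch might look like this; reward, candidate_strategies, and max_stall are all hypothetical names:

    import random

    def time_loop_search(reward, candidate_strategies, max_stall=100):
        # "Message from the future": the best-ever reward and the
        # strategy that achieved it, carried across iterations.
        best_reward = float("-inf")
        best_strategy = None
        stall = 0  # tries since the last improvement
        while stall < max_stall:
            strategy = random.choice(candidate_strategies)  # try something else
            r = reward(strategy)  # live through the loop, observe the outcome
            if r > best_reward:
                best_reward, best_strategy = r, strategy  # replace the best ever
                stall = 0
            else:
                stall += 1
        return best_strategy  # do this one last time, then quit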

It might be tricky to figure out the best span of time to use for each loop, and whether you can somehow nest mini-loops within bigger ones.



