I was excitedly reading through this article eager to learn how the solution worked. Instead, there was a bunch of (to me, as a mathematician) uninteresting detail about how to massage the data with Python. Once the data was in the right shape, feed it to something we will treat as a black box.
I get it that sometimes this is ok. It's perfectly fine to not care about how everything works. I am just disappointed that a blog post about the TSP doesn't contain any actual details about how to solve the TSP. If I were to write such a blog post (and I have written things of this ilk), I would spend a lot more time trying to elucidate the solver's algorithm. I like explaning algorithms.[1] That's how I feel that I've really understood a particular subject.
I suppose overall this makes me quite a different sort of person than the author. I could never tolerate running Mac OS X for any length of time, because being inconvenienced to use the debugger I want (Mac OS X's signing makes it very annoying to run gdb) and being unable to put debugging calls into my OS kernel are unacceptable compromises for me. But people who like black boxes seem to really like black boxes all the way down to the OS they're using.
The map doesn't actually show the optimal path. It is proven that "loops" (the ways crossing each other) can be straightened, and the new version will be shorter. There is one east of Alberquerque.
That's an interesting result; do you happen to have a citation? I couldn't find one with a few minutes of Google Scholar.
If I understand Concorde's claims correctly, there is still the question of finite numerical precision (it doesn't seem to use MPFR or any other arbitrary precision library). Perhaps the suboptimality of the path is less than 1e-7 or 1e-16 (depending on precision) times the distance between the "looped" cities?
Having said that, one of the authors of Concorde is R. Bixby, a co-author of CPLEX (which was for decades, and may still be, the industry standard LP solver including for branch-and-bound problems). And Chvatal is another very widely regarded LP researcher. So I would take Concorde's claims of optimality at face value (though of course there could be a data input error somewhere).
Edit: Ah, I misunderstood the sense of "loop"; I thought there was a subcircuit (which I believe can be optimal in some cases), but instead there are two segments crossing each other that, per wrk1's comment below, should really be shorter if their destinations were "swapped". Rough Google-maps math suggests that would reduce the distance by ~10 miles out of ~16k, which seems well above numerical precision.
In operations research, it's common to stop when a solution is provably within 1e-4 of optimum. Off the top of my head, reasons include: we don't want to optimize FP error, limited precision in the input data, and negligible real world impact.
That said, it's also well known that non-OR practitioners have less confidence in our results when there are trivial local suboptimalities, in some cases even when they don't affect the objective function (e.g., off the critical path in a scheduling problem); I've heard of several professionals who pass the output of exact (modulo stopping criteria) methods through stupid local searches just for that reason.
Concorde produces a provably optimal tour, but it follows the TSPLIB input format and requires that all distances be integers. There will thus be rounding error in converting the geodesic distances to integers. To obtain greater precision, the geodesic distances should be scaled to meters rather than kilometers.
I don't have a citation at hand, but the proof is simple enough. It does rely on the triangle inequality and can be used to construct a quick approximator for this Euclidean variation of TSP. Granted, the math is different on a surface of a sphere. The error could be caused by imprecise input values.
Very nice blog post Mortada! I used Concorde to build an optimal tour where the geodesic distances are scaled to meters (rather than kilometers), using the TSPLIB GEOM norm. Here is a Google Map drawing http://www.math.uwaterloo.ca/tsp/tours/tesla/tesla196_tour.h... If you zoom in, you will see that it does not have the crossings that are mentioned in the comments. The data set can be found at http://www.math.uwaterloo.ca/tsp/tours/tesla/tesla196.tsp The Concorde run time on my Mac Mini was 0.88 seconds.
Tesla superchargers lend themselves well to this problem because of the way the sites were selected. Tesla obviously wants these sites to form clear routes, and they advertise when they hit certain milestones ("NY to San Francisco all on the supercharger network!"). Which explains why the optimal route looks so nice. Very interesting read, well done.
This demonstrates that jsprit and GraphHopper combined (both open source) can be used to achieve similar performance with a lot more precise output (due to read real world data) and if you need to calculate the optimal route with multiple vehicles, time windows, capacity etc that is also not a problem. Also not by bike ;)
The mentioned Christofides algorithm only works for metric TSP. In metric TSP the edges satisfy triangle inequality.
There doesn't exist any polynomial approximation algorithm for general TSP. If it existed we would be able to solve existence of Hamiltonian circuit in polynomial time by a simple reduction and therefore would be able to prove that P = NP.
> Note that we are making the simplifying assumptions that the Earth is a perfect sphere, and that the distance is a simple Euclidean distance, instead of a driving distance. Although one can certainly plug in a different distance metric and follow the same procedure outlined here.
I think that the deformation of the Earths surface are not important, and just change the values but not the metric properties.
But the driving distance (or driving time) are not long a metric, in particular the driving distance between A and B is not equal to the driving distance between B an A. Anyway, the superchargers are so far away that the polynomial algorithm will give the correct result (probably).
I think that the deformation of the Earths surface are not important
This probably only matters if the distances are larger. Probably larger than the distance a Tesla can go on a single charge. And so can be dropped from consideration.
It's important to take into account up/down distances and wind conditions to accurately estimate energy consumption (or range) of a Tesla over a given route.
Question: At the moment, you have Columbus -> Dayton -> Lima -> ... -> Indianapolis -> Cincinnati. What's the extra distance travelled if you were to go Columbus -> Lime -> ... -> Indianapolis -> Dayton -> Cincinnati? The second option just looks like a shorter path, so I'm curious.
Not to mention the situation in New Mexico, which is trivially suboptimal by the triangle inequality.
I think there was either some kind of rounding error getting data into the TSP solver, Concorde is just giving an approximate solution, or the data preparing code in python has a bug.
Disclosure / for dang: I edited the title per my interpretation of the guidelines. In context it's not misleading, but on HN I figured "The Traveling Tesla Salesman" in isolation would cause many mistaken clicks and could be taken to be linkbait.
Dan, please edit back (and bury this comment) if I overstepped :)
I get it that sometimes this is ok. It's perfectly fine to not care about how everything works. I am just disappointed that a blog post about the TSP doesn't contain any actual details about how to solve the TSP. If I were to write such a blog post (and I have written things of this ilk), I would spend a lot more time trying to elucidate the solver's algorithm. I like explaning algorithms.[1] That's how I feel that I've really understood a particular subject.
I suppose overall this makes me quite a different sort of person than the author. I could never tolerate running Mac OS X for any length of time, because being inconvenienced to use the debugger I want (Mac OS X's signing makes it very annoying to run gdb) and being unable to put debugging calls into my OS kernel are unacceptable compromises for me. But people who like black boxes seem to really like black boxes all the way down to the OS they're using.
[1] https://en.wikipedia.org/wiki/Medcouple#Fast_algorithm