This seems like near-proof to me that gradually translating a codebase in flight from any one language to an entirely different one is feasible, as long as you build a data interop layer of some sort (JSON, etc.). I imagine that if it were powering a web app, dual-deployment would become an additional (but potentially manageable) concern during the "transition" period.
This seems a much safer/saner way to do total rewrites/refactorings.
Note that it practically demands decent (if not impeccable) test coverage. He even admits that many parts were not tested; the only saving grace being that, thanks to the similar semantics of the two specific languages here, he was able to reuse roughly the same logic for the less-tested portions, reducing risk.
Also note that, according to the graph, application and test performance will be at their worst in the middle of this process. That is the point where some managers would probably decide to bail on it, which is why I thought it was important to note.
> as long as you build a data interop layer of some sort (JSON, etc.). I imagine that if it were powering a web app, dual-deployment would become an additional (but potentially manageable) concern during the "transition" period
Which brings up an interesting idea:
* All codebases are in some kind of inflight transition. Usually not migrating across languages, but often migrating across authors and authoring styles, sometimes migrating across underlying platforms. So, things that make these inflight transitions easier might well be practices one should consider adopting.
* The "data interop layer" might simply be another way of understanding another principle: if your data structures/formats are (a) legible and (b) well-fitted to your problem domain, your program is probably going to be easier to understand and modify... maybe even when it comes to modifications that might seem extreme.
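To make the "data interop layer" idea concrete, here is a minimal sketch of what that contract might look like during a transition, with the JSON shape (not either language's object model) acting as the shared agreement. The record fields and function names are hypothetical, purely for illustration:

```python
import json

# Hypothetical shared contract during a transition: the "old" half of the
# system writes this JSON shape, the "new" half reads it. Each side keeps
# its own native representation; only the wire format is agreed upon.

def to_wire(user):
    """Old implementation: serialize its domain dict to the agreed JSON shape."""
    return json.dumps({
        "id": user["id"],
        "name": user["name"],
        "roles": sorted(user["roles"]),  # normalize ordering for determinism
    })

def from_wire(payload):
    """New implementation: rebuild its own native structure from the same JSON."""
    obj = json.loads(payload)
    return {"id": obj["id"], "name": obj["name"], "roles": set(obj["roles"])}
```

The point is that as long as both halves honor the serialized shape, either side can be rewritten (even in a different language) without the other noticing.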
Or to use words attributed to Linus Torvalds: "Bad programmers worry about the code. Good programmers worry about data structures and their relationships."
When you think about the data first you can usually synthesize the program. It's almost a game now to try and write a DOOM engine from only the description of the WAD format and its wonderful documentation. I'm under the impression that if you think about your data and how to serialize it first the code will fall out of it.
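The WAD format is a nice illustration of this: the on-disk layout is so well specified (a 12-byte header of magic, lump count, and directory offset, followed by 16-byte directory entries) that a reader practically writes itself. A rough sketch of parsing just the header and directory, assuming the publicly documented layout:

```python
import struct

def read_wad_directory(path):
    """Parse a DOOM WAD file's header and lump directory.

    Layout per the public WAD documentation:
      header: 4-byte magic ("IWAD"/"PWAD"), int32 lump count,
              int32 offset of the lump directory (all little-endian)
      directory entry: int32 lump offset, int32 lump size, 8-byte name
    """
    with open(path, "rb") as f:
        magic, numlumps, diroffset = struct.unpack("<4sii", f.read(12))
        assert magic in (b"IWAD", b"PWAD"), "not a WAD file"
        f.seek(diroffset)
        lumps = []
        for _ in range(numlumps):
            filepos, size, name = struct.unpack("<ii8s", f.read(16))
            lumps.append((name.rstrip(b"\0").decode("ascii"), filepos, size))
    return lumps
```

Once you can enumerate the lumps, the rest of the engine is "just" interpreting each lump's documented structure, which is exactly the data-first workflow being described.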
So I'm starting to think more about formal specifications and documenting serialization formats first, and worrying about the code second.