"Amazon EMR has been adding Spark runtime improvements since EMR 5.24, and discussed them in Optimizing Spark Performance. EMR 5.28 features several new improvements."
Have these improvements been contributed back to Spark? When I take a look at the improvements themselves, it looks like all Amazon did was upgrade Spark from 2.3 to 2.4.
EMR isn't open source, but Spark is. What does the EMR Spark Runtime do, if not offer Spark as a service? And why were the changes that optimize the Spark runtime not contributed back to upstream Spark?
This is just an example. I'm sure there are many others. The developers that take the time to contribute to Spark are making Spark a better product. Amazon is not making it better. Amazon should not claim they made improvements to Spark in a newer version. What they did was upgrade Spark to 2.4 and claim that the improvements were done by them whereas in reality they were done by the community.
Again, this is all wild assumption. They're not required to contribute to free software, but they do upstream changes.
The improvements they're claiming are for their own EMR product, not for Spark, and they do update software to run better on their own infrastructure. That's what their customers care about.
You seem dedicated to believing that Amazon has done nothing for open source, even though I pointed you to an exhaustive list of their contributions. At this point, there's nothing more to say.