Unless an application is pretty poorly written, anything that is safe to self-tune isn't a configuration option in the first place. The crux of the issue is that the system/application doesn't have enough perspective to understand if it should self-tune one direction or the other.
For example, if an app runs out of file descriptors, is that happening because of normal conditions or is something wrong? Increasing the max blindly until the issue goes away is rarely the right answer.
Each self-tuning app would have to have logic more complex than the business logic itself to understand how it will interact in the environment it's running in (other apps, hardware, expected traffic bursts, etc).
This entire article is pretty shallow on the things it attacks. Take the following: "Why does the JVM still need messing around GC settings to get acceptable performance/memory usage?"
The answer to this should be obvious to anyone that has developed high-performance Java apps. There is no possible way for the garbage collector to understand what your application is doing to guess the optimal times to interrupt and collect. Unless your app is a tiny state machine, the garbage collector trying to self-tune based on runtime behavior is going to make your performance worthlessly unpredictable.
> Each self-tuning app would have to have logic more complex than the business logic itself to understand how it will interact in the environment it's running in
That's absolutely false. You can have a rather simple program that compares your app's performance under different system settings and tries to find the optimum. It can even use random search, or maybe something more complex like an evolutionary algorithm. Such a program isn't aware of the complexity of your app; it just needs a way to measure its performance, whatever that is.
I'm pretty sure I've read about a system like that.
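For concreteness, here's a minimal sketch of that kind of tuner in Python. Everything here is hypothetical: the knob names and ranges are invented, and measure_performance is a stand-in for whatever benchmark you trust.

    import random

    # Hypothetical knobs; the ranges are made up for illustration.
    PARAM_RANGES = {
        "heap_mb":   (256, 8192),
        "threads":   (1, 64),
        "buffer_kb": (4, 1024),
    }

    def random_settings():
        return {k: random.randint(lo, hi) for k, (lo, hi) in PARAM_RANGES.items()}

    def tune(measure_performance, trials=100):
        """measure_performance(settings) -> score, higher is better.
        Pure random search: try candidates, keep the best one seen."""
        best_settings, best_score = None, float("-inf")
        for _ in range(trials):
            candidate = random_settings()
            score = measure_performance(candidate)
            if score > best_score:
                best_settings, best_score = candidate, score
        return best_settings, best_score

The search loop is the easy part; all the difficulty hides inside measure_performance.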
You're at least over-optimistic (certainly more wrong than GP).
How are you measuring the performance of the app, then? Depending on your software, it might be throughput, it might be latency, it might be failure-related, etc etc.
The means by which the computer is supposed to work out 'oh, this app should be tuned for latency' is missing; and in many cases it can't know (the answer is data-dependent, etc.). That's the hard bit. You have no way of working out 'better'.
Even if you have a good metric (user-specified, etc.) you're still going to suffer, because there are many failure modes you won't have thought of. For example, say you're running an application and a database. It could easily be the case that the database taking more memory slows the application down while improving the database's own average latency by reducing contention. That would be a positive for the database, but a negative overall.
So, you'd need to have overall knowledge of a system. Then things get really complicated.
Where stuff can be optimised programmatically, it generally already is; there's no setting for it. For example, the JIT settings in most runtimes are not exposed to the user (unless they dig deep). The remaining configuration settings should be things that the system can't deduce itself, and so needs to be told.
Trying to 'optimise' this without knowing what needs to be optimised is silly.
On a typical app, the user will have to trade off: peak/average memory, peak/average cpu, file storage usage, network, latency, throughput, cache behaviour, tolerance to pauses, and somehow produce a viable weighting function out of all of this.
This function will need a tonne of refinement, because the initial weighting or three will lead to wastage, and it will end up being complex and nonlinear. The optimisation process will be expensive and time-consuming.
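To make that concrete, the naive starting point is a linear weighting like the sketch below (metric names and weights invented). The point above is that this linear form is precisely what breaks down: real preferences are nonlinear, e.g. 'latency under 100 ms or we lose the customer'.

    def objective(metrics, weights):
        """Collapse competing metrics into a single score to maximize.
        Costs (latency, memory) get negative weights."""
        return sum(weights[name] * value for name, value in metrics.items())

    score = objective(
        {"latency_ms": 42.0, "throughput_rps": 1800.0, "peak_mem_mb": 512.0},
        {"latency_ms": -5.0, "throughput_rps": 1.0,    "peak_mem_mb": -0.5},
    )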
The tuning process will be basically voodoo, harder than easily documentable configurations. Right now, experience can lead to dramatic improvements with ease. At the moment, tuning GC will also give you pointers for how to improve your program; and tuning your program will also give you pointers for tuning GC. That'll go.
Then, we get the same problems as before wrt contention for system resources, so the system has to somehow be aware of all the other programs running on the computer.
This all ends up being very difficult; much harder than performance tuning.
Again, things that the system can work out on its own, the user shouldn't specify. Things that the system can't work out on its own, the user should specify, or the system should make a good guess.
>I'm pretty sure I've read about a system like that.
Cut back the confrontation when you can't even point to a specific system that does this successfully.
>You can have a rather simple program that compares your app performance with different system settings and tries to find the optimum.
The optimum what? How does it know it's not just reached a local maximum? How does it know that the current inputs aren't just optimal for those settings and that different requests could wreck everything?
A robotic analogue is tuning PID loops. There is loads of literature on the subject, but it turns out to be quite hard, despite the relative simplicity of the control law [1]. It's the underlying dynamics that result in there being no golden rule of PID tuning if you start the system from a random state.
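For reference, the control law really is only a few lines; here's a textbook discrete-time PID sketch in Python. The gains are placeholders, and choosing them for a given plant is exactly the hard part.

    class PID:
        """u = Kp*e + Ki*integral(e) + Kd*de/dt, in discrete time."""
        def __init__(self, kp, ki, kd):
            self.kp, self.ki, self.kd = kp, ki, kd
            self.integral = 0.0
            self.prev_error = 0.0

        def update(self, setpoint, measured, dt):
            error = setpoint - measured
            self.integral += error * dt
            derivative = (error - self.prev_error) / dt
            self.prev_error = error
            return self.kp * error + self.ki * self.integral + self.kd * derivative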
The whole computer stack is similarly complex, but higher-dimensional, and it's not clear what metric you would tune against (it would be application-specific). Not that I think it's impossible; lots of regions of computer science are self-adaptive (TCP, splay trees). It's just that the ensemble is a mega chaotic space. Maybe someone will hook up a DBN and skynet it soon.
What systems other than temperature control have you used an auto-tune with?
I attempted to use a PID auto-tune on a hydroelectric turbine once and watched (with horror) as it stroked the wicket gates open and closed with a 10 s period for about 30 s before the wicket gate linkage actually broke. The linkage breaking was due to an improperly set 0-point, but it became clear that auto-tune isn't appropriate for all systems, and holy hell, I'd be damn sure I knew what an auto-tune was going to do before I unleashed it on actual equipment again. Probably I would want to specify the min and max for the CV ... more knobs that need to be set.
In my average hydroelectric plant there are 320 constants or "knobs" for the system plus 170 for each unit, so 1000 parameters that need to be determined during programming or commissioning for a 4-unit plant.
Very good point. Outside of temperature control, I have not seen these used. At my workplace, we have a motor and controller that are presently unusable because the supplier can't come up with workable PID settings. Fortunately it's a demo unit for a prototype, so nobody's out any money.
* Lack of information -- in order to pick the right parameters, the system needs to know what you're trying to achieve (e.g., do you want to minimize latency or maximize throughput?). Communicating that more effectively than just specifying parameters is challenging.
* Instead of telling the system what you want, it can try to figure it out -- to some extent -- by observing program behavior and dynamically fiddling with values. The problem with this approach is that it creates feedback loops that muddle things beyond repair.
* Optimization over a high dimensional space is hard. As an example (even with just one dimension), see this bit from Doug Lea's talk about why dynamic tuning of spinning vs. blocking on concurrency constructs doesn't work: https://youtu.be/sq0MX3fHkro?t=39m48s
* Writing systems that never reboot[1] and are efficient is very hard. Dynamic variables usually require inserting checks at runtime that can't be optimized away by an AOT compiler. This is where a speculatively-optimizing JIT comes in handy, but JITs aren't appropriate for all uses, and even where they are, good optimizing JITs are notoriously hard to write.
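As a sketch of that last point (all names invented): a tunable that can change at runtime forces a check into the hot path that an AOT compiler can't fold away.

    def fast(item): return item * 2           # stand-in fast path
    def slow(item): return sum([item] * 2)    # stand-in slow path

    config = {"use_fast_path": True}  # mutable at runtime

    def process(item):
        # This branch is re-evaluated on every call, because an AOT
        # compiler must assume the flag can change. If the flag were a
        # compile-time constant, the branch and the dead arm could be
        # folded away; a speculating JIT recovers that by compiling the
        # hot arm and deoptimizing if the flag ever flips.
        if config["use_fast_path"]:
            return fast(item)
        return slow(item)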
--
> Why does the JVM still need messing around GC settings to get acceptable performance/memory usage?
OpenJDK's HotSpot does fairly reasonable auto-tuning now (known as "GC ergonomics"). You can pick either the throughput collector and maybe set your young-gen ratio, or G1 and set a maximum-pause goal. Along with the maximum heap size, those are just three values, one of which is binary, another is often unnecessary, and the third can be a very rough estimate. This should be more than enough for the vast majority of systems, certainly with Java 8. Many of the GC tuning parameters you see in the wild are old remnants from before GC ergonomics that people are afraid to pull out.
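For illustration, under that scheme the whole tuning surface can be as small as this (heap size, ratio, and pause goal are rough placeholders):

    # Throughput collector, heap cap, optional young-gen ratio:
    java -XX:+UseParallelGC -Xmx4g -XX:NewRatio=2 -jar app.jar

    # ...or G1 with a maximum-pause goal:
    java -XX:+UseG1GC -Xmx4g -XX:MaxGCPauseMillis=200 -jar app.jar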
Well, in my opinion, configurable autotuning already exists. Don't confuse a server that never reboots with a system that never stops.
For example, AWS services integrate metrics to define behaviors (scale up, scale down, etc.), and you can implement pretty convenient autotuning algorithms on top of them, even ones that learn from previous metrics.
The Linux kernel's networking and I/O layers are full of autotuning code. PowerTOP is another example, and even the Perl interpreter does some dark magic.
Thinking about my own systems, I've done noob autotuning too (alert control, alert frequency, recovery steps, recovery notifications, systems growing their swap before a cron job runs, dynamic filtering tables based on app/network input, sysctl autotuning daemons, etc.).
There is plenty to improve in autonomous systems, but autotuning is just about programming programmable components to react to events. Depending on those components, you have to coordinate either similar kinds of events or very different ones. From IRQ interrupts to a cache hit at the other end of the connection, there are plenty of places to implement autotuning, from many vendors, and maybe that is the root of the issue.
> * Writing systems that never reboot[1] and are efficient is very hard. Dynamic variables usually require inserting checks at runtime that can't be optimized away by an AOT compiler. This is where a speculatively-optimizing JIT comes in handy, but JITs aren't appropriate for all uses, and even where they are, good optimizing JITs are notoriously hard to write.
But special-case JITs that essentially build a version of a function out of predefined blocks are not that hard to implement. That's enough to remove branches that depend on current (constant) state and to do constant folding.
Useful for parsing, some generic data structures, and graphics: anything that is executed very many times and has configurable (but constant) state that controls conditional branching, or a part of the computation that could be constant-folded.
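A closure-based sketch of that idea in Python (not a real JIT, but the same specialization trick; the config keys are invented): branch on the constant state once, at build time, instead of on every call.

    def compile_parser(config):
        """Assemble a specialized function out of predefined blocks."""
        steps = []
        if config["strip_whitespace"]:
            steps.append(str.strip)
        if config["lowercase"]:
            steps.append(str.lower)
        sep = config["separator"]  # constant-folded into the closure
        def parse(line):
            for step in steps:     # only the selected blocks remain
                line = step(line)
            return line.split(sep)
        return parse

    parse = compile_parser({"strip_whitespace": True, "lowercase": True, "separator": ","})
    print(parse("  A,B,C  "))  # ['a', 'b', 'c']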
Wasn't there a Linux kernel contributor with a medical background who used to push unsuccessfully for it to have more self-tuning/homeostatic behaviour? (It wasn't Greg Kroah-Hartman, was it?)
There was, but it's not Greg. He's still actively involved as a major maintainer (namely, the stable version releases).
You are thinking of Con Kolivas [0]. I remember when he was proposing a self-learning scheduler. The necessary feedback loop was considered far too brittle, and the calculations hard to reason about and even harder to understand.
There was also an awful lot of politics involved. I can't blame him for getting fed up.
There are lots of self-tuning systems in industrial control now. Some of the theory overlaps with machine learning. The theory is really hard, so hard that control theory PhDs struggle to decide what math to learn.
I get IEEE Control Systems Technology magazine, but I don't understand most of it any more.
I think the author does a great job of venting frustration at the state of autotuning systems research, though I would disagree that the research interest has dried up. On the contrary, autotuning research is alive and kicking; the problem is that there have been few attempts to unify all the competing systems that exist (with some exceptions [1]). As such, the state of autotuning is fragmented, with no one approach able to achieve the critical mass needed to hit the mainstream.
Self-tuning systems are inherently complex and opaque. Execution tends to be non-deterministic and irreproducible, and so the burden of testing is far greater.
It takes some discipline to write elegant, loosely coupled code that is self-tuning. I work on scientific software, where reproducibility and reusability are (…or at least should be) paramount, and where black boxes are evil. It's hard not to build black boxes when you're writing self-tuning software.
JIT means just-in-time: in this context, native compilation just before execution. Production JITs are usually indeed self-tuning, but this self-tuning has nothing to do with memory allocation or garbage collection. JITs can allocate memory manually; many JITs don't have garbage collection at all.
When you read a value from a collection, move it towards the front.
Requires that order not be important, of course. Ditto, it's more trouble than it's generally worth if you're doing any sort of multithreading, or if you're using something where writes are much slower than reads.
It's also handy for most hash tables. (Though generally you should be using a Cuckoo hashtable anyways)
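For reference, the move-to-front list being described is only a few lines; a minimal sketch:

    class MTFList:
        """Self-adjusting list: each successful lookup moves the hit to
        the front, so frequently read values drift toward the head."""
        def __init__(self, items=()):
            self.items = list(items)

        def find(self, key):
            for i, (k, v) in enumerate(self.items):
                if k == key:
                    self.items.insert(0, self.items.pop(i))  # move to front
                    return v
            raise KeyError(key)

        def insert(self, key, value):
            self.items.insert(0, (key, value))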
There are optimizers for JVM flags. One of them is called Groningen, and it will optimize your GC parameters for throughput, pause duration, or other custom goals that you provide. I believe it uses a genetic algorithm.