Never mind, another comment mentions AWS. You seem to pretty much run at cost then (a little less), if your GPU hours are $0.432 @ 2 cores, 32G, and a K80. Packing a p2 instance would run you $7.20 (or $14.40, etc.), and back-of-the-napkin math says you can pack about 15 GPU jobs into a p2.8x. So your hourly cost per job is $0.48.
Yes, we currently run on AWS! The p2.8xlarge has 8 GPUs, not 16 :) So it would still boil down to $0.90/hr/GPU. Also, utilizing 8 GPUs concurrently is pretty difficult/inefficient for most jobs, since the benefits of parallelization taper off quickly.
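The per-GPU math above can be sketched in a few lines. The prices here are the on-demand p2 rates quoted in this thread ($0.90, $7.20, $14.40/hr); they are assumptions for illustration, not a pricing reference:

```python
# Back-of-the-napkin per-GPU cost for the p2 family, using the
# on-demand hourly prices mentioned in this thread (assumed, not
# fetched from AWS): instance -> (hourly price in $, GPU count).
P2_PRICES = {
    "p2.xlarge": (0.90, 1),
    "p2.8xlarge": (7.20, 8),
    "p2.16xlarge": (14.40, 16),
}

for instance, (hourly, gpus) in P2_PRICES.items():
    # Packing more jobs than GPUs doesn't help, so per-GPU cost is the floor.
    per_gpu = hourly / gpus
    print(f"{instance}: ${per_gpu:.2f}/hr per GPU")
```

Every size comes out to the same $0.90/hr per GPU, which is why packing a bigger instance only pays off if scheduling overhead or idle capacity is reduced, not from the raw price.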
We are only using p2.xlarge (1 GPU) for now. Driving down cost is really important to us. We use reserved instances, spot fleets, etc. to get below 50% of on-demand AWS pricing. Lots of interesting challenges to be solved there w.r.t. effective scheduling and fully utilizing resources.
We’ve been thinking about our own infrastructure. It would really drive down the cost, but obviously, comes with its own challenges :)