That seems to be a common take on HN - cloud is too expensive.
I'm curious whether the folks claiming that have any data center ops experience.
Because, personally, I'd rather retire than deal with Dell, HP, Cisco, fibers, cooling issues, physical security, hardware failing... And that's just the hardware. Then you still need to pay VMware for a decent virtualization platform, monitoring tools, etc. Seriously, no amount of money would make me work in a DC again.
I believe companies selling bare metal as a service are a happy compromise of cost and convenience, though.
ML workloads definitely cost a lot of money. Even for a preemptible VM, A100 GPUs cost $0.88/hr/GPU. That's over $630 a month for a single GPU, and that's only the 40GB model. Want a dedicated 8-GPU machine in the cloud to do training with? That'll run you around 16 grand a month. Do that for 2 years and you may as well have bought the device. Want to do 16/24/40-GPU training? Good luck getting dedicated cloud machines with networking fast enough between them that MPI works correctly, and prepare to open your wallet.
Also, that's just compute. What about data? Sure, the cloud takes your data in cheaply, but they also charge you for egress of that data. Yes, you should have your data in more than one location, but if you depend on just the cloud then you need it in different AZs, which costs even more money to keep in sync and available for training runs.
I think for simple workloads and renting compute for a startup, cloud definitely makes sense. But the moment you try to do some serious compute for ML workloads, good luck and hope you have deep pockets.
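The 2-year break-even claim above is easy to sanity-check with a back-of-the-envelope script. All prices here are illustrative assumptions, not vendor quotes:

```python
# Back-of-the-envelope cloud-vs-buy break-even for a GPU training box.
# Every number below is an illustrative assumption, not a vendor quote.

def months_to_break_even(purchase_price: float, cloud_monthly: float) -> float:
    """Months of cloud rental after which buying outright would have been
    cheaper (ignoring power, space, staff, and depreciation on the owned box)."""
    return purchase_price / cloud_monthly

# Assume ~$2.75/hr/GPU for a dedicated A100 and a ~$150k purchase price
# for a comparable 8-GPU server.
cloud_monthly = 8 * 2.75 * 24 * 30   # ~= $15,840/month, in line with "16 grand"
purchase_price = 150_000

print(f"{months_to_break_even(purchase_price, cloud_monthly):.1f} months")  # ≈ 9.5
```

With these made-up numbers the break-even is well under a year, which is why the "do that for 2 years" framing above is, if anything, conservative.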
The other thing is Nvidia sells GPUs with similar performance at two very different prices: one price for data centers and a quite different price for gamers. If you do the job yourself you can often get away with using the much cheaper gamer-grade cards for AI work (unless you need a lot of VRAM), whereas providers such as AWS can't do that and are required by Nvidia to use the considerably more expensive cards. If your workload fits on a gamer-grade card, there's no contest on price between an on-prem system and the cloud.
That is a really good point, and the 3090s have a surprising amount of VRAM on them. For many smaller models this is sufficient. However, where I work without going into a lot of specifics, because of the size of the models, the amount of VRAM is crucial, as well as the infrastructure of the PCI lanes connected to it, the speed of the local storage, and the networking between both cards on the same node as well as between nodes.
The moment the model gets bigger than any one GPU's VRAM, the difficulty of training it goes up by orders of magnitude.
Re data, I think egress rates are going to start disappearing over the next few years.
The part that's always missing from these rent-vs-buy analyses on HN, for some reason, is the opex cost of operating your own hardware, which is going to be non-zero. Sure, it won't be quite as expensive (no profit margin), but it's not an order of magnitude cheaper. Additionally, most companies don't run the HW 24/7, and if they do, it's not the kind of operation they want to hire people to support. It's not just running it: you have to invest in and grow something that's not a core competency to get the economies of multiple teams loading up the HW.
If the next revolution in cloud comes in to cause companies to onsite the HW again, it’ll look like making it super easy to take spare compute and spare storage from existing companies and resell it on an open market in an easy way. Even still, I think the operational challenges of keeping all that up and running and being utilized at as close to 100% as possible and not focusing on your core business problem will be difficult because you won’t be able to compete with engineering companies that have a core competency in that space.
> The part that's always missing from these rent-vs-buy analyses on HN is the opex cost of operating your own hardware, which is going to be non-zero.
Effectively hiring, retaining, evaluating and rewarding competent staff is hard. Even at a big company the datacenter can be a really small world, which makes it hard for your best employees to grow. Things are especially hard when you don't have a tech brand to rely on for your recruiting, and the staff's expertise is far outside the company's core business, making it harder to evaluate who's good at anything.
> Re data, I think egress rates are going to start disappearing over the next few years.
I'm not sure why you think that. AWS hasn't budged on their egress pricing for a decade (except the recent free tier expansion), despite the underlying costs dropping dramatically. GCP and Azure have similar prices.
Fact is, egress pricing is a moat. Cloud providers want to incentivize bringing data in (ingress is always free) and incentivize using it (intra-DC networking is free), but disincentivize bringing it out. If your data is stuck in AWS, that means your computation is stuck in AWS too.
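To make the moat concrete, here's a rough sketch of what a one-time migration of a training dataset costs at list-style egress rates. The per-GB rate is an assumption in the ballpark of published prices, not an exact quote:

```python
# Rough egress cost for moving data out of a cloud provider.
# The per-GB rate is an illustrative assumption, not an exact quote.

def egress_cost_dollars(terabytes: float, dollars_per_gb: float = 0.09) -> float:
    return terabytes * 1024 * dollars_per_gb

# Moving a 100 TB dataset out once:
print(round(egress_cost_dollars(100)))  # → 9216
```

Nearly ten grand to move your own data once, which is exactly the switching cost the parent comment is describing.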
Disclosure: I work on Cloudflare on R2 so I’m a bit biased on this.
I think we're going to put real pressure on traditional object storage rates to come down, since Cloudflare's entire M.O. is running the network with zero egress. As we expand our cloud platform it seems inevitable that you will at least have a strong zero-egress choice, and if we do a good job Amazon et al. will inevitably be forced to get rid of egress. Matthew Prince laid out a strong case for why either scenario is good for us in a recent investor day presentation (either we cannibalize S3's business and R2 becomes a massive profit machine for us because they refuse to budge on egress, or Amazon drops egress, which is an even larger opportunity for us).
Products like Cache Reserve help you migrate your data out of AWS transparently from any service (not just S3) - you just pay the egress penalty once per file.
Anyway. I’m not saying it’s going to disappear tomorrow but I find it hard to believe it’ll last another ten years.
> totally ignoring the opex cost of operating your own hardware which is going to be non 0
Early in my career I worked at a company with a DC onsite. I remember the months-long project to spec, purchase, and migrate to a new, more powerful DB server. How much that cost in people-hours, I have no idea. I upgraded to a better DB a couple months ago by clicking a button...
Don't even get me started with ordering more SANs when we ran out of storage or the time a hurricane was coming and we had to prepare to fail over to another DC.
Cloudflare Bandwidth Alliance and R2. S3 felt some pressure just because of our pre launch announcement. It’ll be interesting to see how they adjust over the next couple of years.
They're actually probably paying more than the average listed prices, as mainframe users (a mainframe being basically an on-premise PaaS, where IBM rents you a high-performance, distributed, redundant hardware cluster on a pay-as-you-use basis) are often dependent on very high reliability, high uptime, and low latency.
The ability to scale up experiments is really nice in cloud. In my experience you need to be quite large before you’re using your own GPUs at a utilization percentage that saves money while still having capacity for large one off experiments.
There are a few different ways to run a data center, a subset of which are much less expensive than the cloud but require a level of competency that some organizations will never have. It can also be relatively pain-free when done well. Some workloads are inherently inefficient in the cloud because of the architecture.
Data center ops is ultimately a supply chain management problem, but most people don't treat it as such. That was my primary learning from doing data center ops at a few different companies. If you get the supply chain management right, and are technically competent, there can be a lot to recommend running your own data centers.
From a company's perspective, you pay those costs no matter what.
If the AC breaks at 3am it needs to be fixed. It doesn't matter if you have your own HVAC people on site 24x7, your own people on call to service it, a local HVAC service to come in, or you outsource the entire operation and have no idea how it's handled. In the end the important part of this story is that whatever you are doing with the AC continues to work. Different operations demand different levels of service (I doubt anyone keeps HVAC techs on staff 24x7, but if the AC is that critical, it's mandatory). The only case where the CEO is up at 3am is if the CEO is the owner of the local HVAC service company, not the CEO of the building with the problem.
Once you realize that to management the cost is outsourced no matter what, the only question is whether to do it with your own people and HR, or outsource it. There are pros and cons to both approaches, but for most companies it isn't their business, and so the only reason to do it in house is that they can't trust any company they hire.
The thing is, the cost for the HVAC 24x7x365 support for a datacenter will be roughly the same for a given location... but it makes a difference if it is you paying the whole bill (=you're self-hosting in your own datacenter), you are splitting the bill with a bunch of other customers indirectly (=you're self-hosting in a colo DC), or if you're splitting the bill with a shitload of other customers (=you're using some service on one of the big public cloud providers).
The downside of saving those costs is that you lose control with every step away: as soon as you go into a datacenter of any kind, you simply cannot call up an HVAC company and offer them 100k in hard cash to show up in the next 60 minutes and fix the issue. With a colo DC you can usually show up in person to see whether the HVAC, UPS, and other systems are appropriate to your needs, but with one of the big cloud providers you have to take their word that they're doing things correctly.
> so the only reason to do it in house is they can't trust any company they hire
Now I'm deeply confused. Any company you hire either has a profit margin (plus enough to fund an "oh shit" fund in case times turn bad) or won't stick around longer than a few years. In which case, why not just hire people directly and cut out the other company's profit margin? Assuming you hire similar people at the same rate, using your own existing and already-paid-for HR, how is that not cheaper?
You need to account for overhead. Nobody does their own HVAC in house, because you rarely need it, and you'd have to pay to train people on it despite rarely using them.
In some cases you can even get a discount. Utilities are a big customer of tree trimming; the companies doing that work can give them a great deal because the utility doesn't care that they take a week off after a storm to do high-profit-margin consumer trimming.
Lots of places have their own HVAC techs in house, if they have enough HVAC work to justify it. Even if it's not their core line of business. They will do whatever costs less, +/- some amount of subjective "hassle factor."
Especially when it's "line critical" to their business, or if the person can do other things as well.
Larger hotels often have dedicated staff for things like HVAC, etc, because the importance of getting things fixed quick if possible is worth the cost of having someone onsite/available.
And you see similar things with colleges, etc.; they often have a maintenance department that can be pretty large (though no doubt they've spun it off and brought it back in-house for the same "change is progress" reasons).
I have dealt with a large number of retail colo providers, wholesale data center providers and corporate owned data centers across the US over the last 20 years and all of them used contractors for HVAC and electrical. I'm not saying dedicated staff never happens but it is definitely not the norm.
Doesn't this logic apply to pretty much everything? Why hire external anything then? Why not do your own deliveries, hire your own trucks to transport goods etc?
There is a cost to taking on things that aren't part of your core business too.
All I can come up with is "Because economies of scale". I work for a transportation company, but we employ plumbers, carpenters, electricians, elevator repairmen, and many more that I'm not aware of, because we have enough locations / work to justify them. The pizza place has enough work to justify hiring a fleet of drivers, Amazon ships enough crap to justify having their own trucks (when they can't sucker another company into taking the unprofitable routes).
Similarly, Google doesn't ship enough stuff worldwide to justify drivers, insurance, trucks, jets, etc. - Fedex has the size and scale to make every package a couple cents cheaper, so it's just not worth it for Google.
The only other argument I can think of is the challenge of keeping every plate spinning, in good times and in bad. This is where your point of having a cost to take on something outside your core business comes in, but we seem to be in an era of mega-corporations - I'd expect lots of companies to snake tendrils into whatever will save them a fraction of a cent every time they have to do something.
Not really. Time to service depends on SLA and redundancy. If you have no redundancy your time to service must be less or equal than your SLA. If you have redundancy it can be longer.
I've got some experience with a big academic data center - >1 acre floor space, >10MW, ~$100M construction cost. I've also worked for commercial companies of various sizes.
If your compute installation is big enough that payroll is a small fraction of the operating cost, then it's way cheaper than cloud. (that payroll has to include people who actually know how to build and run a huge compute installation)
The problem is that people come in integer units, you need a bunch of them to cover a bunch of different areas of expertise, and the particular ones you need are expensive. If you've got $1M worth of computers, you're almost certainly better off scrapping them and going to cloud, although the folks you're currently paying to run them might disagree. If you have $100M+ worth of machines it's a whole different ballgame; I'm not sure where the exact crossover is.
Note - that's assuming a single data center, and that you're big enough to build your own data center instead of renting colo space. If you need your machines to be geographically dispersed, you'll need to be even bigger before it's cheaper than cloud, and I'm not sure whether you'll ever hit crossover if you're renting colo space.
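The payroll-as-a-fraction argument above can be sketched as a toy model. Every figure below is a made-up assumption; the point is the shape of the curve, not the exact numbers or crossover point:

```python
# Toy crossover model: a fixed ops payroll amortized over the hardware fleet.
# All numbers are made-up assumptions for illustration only.

def on_prem_annual_cost_per_hw_dollar(fleet_value: float,
                                      annual_payroll: float = 2_000_000,
                                      hw_amortization: float = 0.33) -> float:
    """Annual cost per dollar of hardware owned: ~3-year amortization plus
    a fixed team of DC/ops staff spread across the whole fleet."""
    return hw_amortization + annual_payroll / fleet_value

CLOUD_EQUIVALENT = 1.0  # assume cloud charges ~3x the amortized hardware cost

for fleet in (1e6, 10e6, 100e6):
    cost = on_prem_annual_cost_per_hw_dollar(fleet)
    verdict = "cloud wins" if cost > CLOUD_EQUIVALENT else "on-prem wins"
    print(f"${fleet/1e6:>5.0f}M fleet: {cost:.2f}/hw-dollar/yr -> {verdict}")
```

With these assumptions the $1M fleet loses badly to cloud while the $100M fleet wins easily, with the crossover somewhere in between, which matches the intuition above.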
1000% this. HN loves to talk about Dropbox. I spent most of my (short, praise God) career at Dropbox diagnosing a fleet of dodgy database servers we bought from HPE. Turns out they were full of little flecks of metal inside - thousands of them, iron filings everywhere. You think that kind of thing happens when you are an AWS customer?
If you are sophisticated enough to engage an ODM, build your own facilities, and put hvac and electricians on 24-hour payroll, go on-prem. Otherwise, cloud all the way.
That's not quite where I would draw the line, I don't think. I used to work for an ISP and we were kind of split between AWS and on-prem. Obviously, things like terminating our customers' fiber feeds had to be on-prem, so there was no way to not have a data center (fortunately in the same building as our office). Moving our website to some server in there wouldn't have been much of a stretch to me, at the end of the day, it's just a backend for cloudflare anyway.
Like most startups, our management of the data center was pretty scrappy. Our CEO liked that kind of stuff, and we had a couple of network engineers that could be on call to fix overnight issues. It definitely wasn't a burden at the 50 employees size of company (and that includes field techs that actually installed fiber, dragged cable under the street, etc.)
We actually had some Linux servers in the datacenter. I don't know why, to be completely honest.
So overall my thought is: maybe use the cloud for your 1-person startup, but sometimes you just need a datacenter, and it's not really rocket science. You're going to have downtime while someone drives to the datacenter. You're going to have downtime when us-east-1 explodes, too. To me, it's a wash.
I mean, you did want to manage bare metal servers, right?
AWS almost certainly gets batches of bad hardware too. And if your services are running on the bad hardware, you can't take a peek inside and find the iron filings. For servers this is probably not too bad; there have been articles about dealing with underperforming EC2 VMs going back a long time, and if you experienced that, you'd find a way around it. AWS has enough capacity that you can probably get VMs running on a different batch of hardware somehow. With owned hardware, if your first order of important database servers is all dodgy, that's a pickle; HPE probably has quick support, once you realize it's their hardware.
If your cloud provider's network is dodgy though, you get to diagnose that as a blackbox which is lots of fun. Would have loved to have access to router statistics.
There's a lot of stuff in between AWS and an on-prem/owned datacenter, too.
> If you are sophisticated enough to engage an ODM, build your own facilities, and put hvac and electricians on 24-hour payroll, go on-prem. Otherwise, cloud all the way.
I imagine the entire sentiment of the comments is because FedEx is one that really should be sophisticated enough.
There is a smooth curve between cloud and dedicated DCs, which has various levels of managed servers, co-location, and managed DCs. (A managed DC can be a secure room in a DC "complex" that shares all the heavy infrastructure of DCs.)
Primarily, the FedEx managers are committing the company long-term to Oracle/Microsoft platforms. Probably mostly to benefit their own careers.
Outsourcing hosting and management of DCs would have been something different, and probably healthier for FedEx and the industry.
> You think that kind of thing happens when you are an AWS customer?
You bet it does! But as the AWS customer you'd never notice because some poor ops dude in AWS-land gets to call up the vendor and bitch at them instead of you. It ain't your problem!
Are you saying that part of the expected savings from going on-prem is that you will have to disassemble equipment bought from major OEMs and examine it for microscopic metal dust?
That doesn't sound like it will save much money, honestly.
They're saying it's a surprise to hear that Dropbox doesn't know what QC and order acceptance mean. And it is; I agree. That you spent the time investigating it, implying those servers were in production, is a shibboleth to those of us who know what we're doing when designing hardware usage that Dropbox doesn't. It is, however, your self-sourced report, and we don't have an idea of scale, so maybe they do and you were just unlucky.
And no, operators don’t disassemble to perform QC. And no, I could hire an entire division of people buying servers at Best Buy, and disassembling them, and stress testing them, and all of that overhead including the fuel to drive to the store would still clock in under cloud’s profit margin depending on what you’re doing.
You’re of course entitled to develop your cloud opinion from that experience. That’s like finding a stain in a new car and swearing off internal combustion as a useful technology, though, without any awareness of how often new cars are defective.
Many hardware problems do not surface at burn-in. Even at Google, the infamous "Platform A" from the paper "DRAM Errors in the Wild" was in large-scale production before they realized it was garbage.
Filings from the chassis stamper, which yours certainly were given the combination of circumstances and vendor, are present when the machine is installed. If you’re buying racks, your integrator inspects them. If you’re buying U, you do. It’s a five minute job to catch your thank-God-my-career-was-short story before the machine is even energized, which I know because I’ve caught the same thing from the same vendor twice. (It’s common; notice several comments point to it.) Why do you think QC benches have magnifiers and loupes? It’s a capital expenditure and an asset, so of course it’s rigorously inspected before the company accepts it, right? That’s not strange, is it?
You can point at Google and speak in abstracts but it doesn’t address the point being made, nor that your rationale for your extreme position on cloud isn’t as firm as you thought it was. Is Dropbox the only time you’ve worked with hardware? I’m genuinely asking because manufacturing defects can top 5% of incoming kit depending on who you’re dealing with. Google knew that when they built Platform A. The lie of cloud is that dismissing those problems is worth the margin (it ain’t; you send it back, make them refire the omelette, and eat the toast you baked into your capacity plan while you wait).
Are you saying you just buy some servers, unpack them, and throw them into production? Oh man, the lost art of the sysadmin. If your system is not stable (in testing) you for sure disassemble it, or send it back. How much money have you lost playing around with your unstable database? Was it more than testing your servers for a few weeks? Do you buy/build software and throw it into production without testing?
You can test your stuff and still be profitable - Hetzner, AWS, etc. would make no money otherwise. You know they test their servers much more (sometimes weeks/months).
Maybe in the first days they survive it, but the flakes are 99% from the fans/bearings; that's why you test servers at max load for at least 1 week and HDDs for 2-4 weeks.
But I don't think they did even an initial load/stress test.
Unpack it, throw it into the rack, no checking of internal plugs, just nothing... pretty sure about that.
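The kind of burn-in being described doesn't have to be a big process; it can be a short script. This is a sketch assuming `stress-ng` and `smartctl` are installed, with durations matching the week-plus soak suggested above:

```shell
#!/bin/sh
# Burn-in sketch for a newly racked server. Tool choice and durations are
# assumptions; adapt them to your hardware and risk tolerance.

# Load CPU, RAM, and disks at max for a week; stress-ng exits non-zero
# if any stressor detects a failure.
stress-ng --cpu 0 --vm 2 --vm-bytes 80% --hdd 2 --timeout 168h --metrics-brief

# Kick off long SMART self-tests on every drive.
for disk in /dev/sd?; do
    smartctl -t long "$disk"
done

# After the self-tests finish, look for early-failure indicators, e.g.:
#   smartctl -a /dev/sda | grep -Ei 'reallocated|pending|uncorrect'
```

A machine that survives this plus a visual inspection of the chassis interior has a much better chance of being production-worthy than one that went straight from the pallet into the rack.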
Metal chips are squarely in the long tail of failure modes that you can't really anticipate (but of course it's really easy to be smug about in hindsight). It is also extremely unlikely to be the bearings; most likely these are from a chassis frame assembly that wasn't cleaned up properly.
I had some metal dust once and it was from bearings, but OP said flakes and then microscopic particles. Particles = bearings, flakes = chassis or even stickers. But anyway, if only because of transport, you don't throw a server into production without testing and inspection.
I am being smug about not testing your hardware the way you would test software... shoddy testing is shoddy testing, and that goes for software, hardware, firmware, and everything in between. Even for your diesel generator ;-)
Wait, are you saying that an org needs expertise to QC all of the hardware it procures? How expensive is that? How easy is it to hire that type of QC?
Well, are you saying that an org needs expertise to inspect faulty cars, like, by calling a mechanic?
Is that too much these days for companies that own fleets of cars? Is opening a server harder than checking what's wrong with a car? Like, a cable comes loose and that's game over?
The point, which you seem so dedicated to avoiding, is that "in the cloud" these steps are not my problem. Inspecting a literal shipload of computers for subtle defects is a pain in the ass. Amazon does it for me. When I get on an airplane I do not personally have to run the checklists. The airline does it for me.
> (but then do it right, and not like a amateur who build's his first "gaming-pc").
Again, still avoiding the point, but oddly enough proving the point. You assume everyone isn't an amateur and knows how to build and maintain server hardware. Furthermore, because the market doesn't have enough talent to support all of the companies that exist, consolidating this to a few vendors who do have the expertise is what makes sense (economies of scale) and is what the market already decided.
>Again, still avoiding the point, but oddly enough proving the point.
Please read, that was my comment:
>>Not true the point was you pay for it (cloud), or you do it yourself
>You assume everyone isn't an amateur and knows how to build and maintain server hardware.
Yes, that I assume, correct. Otherwise I would not call it "maintaining". Is an amateur maintaining your car? Your software? If you have just amateurs handling your hardware, it's probably better to pay a cloud provider or an integrator to do that.
>"I believe companies selling bare metal as a service are a happy compromise of cost and convenience, though."
This is what I do. I rent bare metal from Hetzner and OVH. I also have some hosting hardware right at my place. It saves me a ton of money, and no, I do not spend any meaningful time administering it. It's all done by a couple of shell scripts. I can re-create a fresh service from backup on a clean rented machine in no time.
As for cloud - if I need to run some simulation once a month on some bazillion-core computer, then sure, cloud makes much sense in that particular case. I am sure there are other cases where it can be cost-effective. But for the average business I believe cloud is a waste of resources and money.
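As a sketch of what "a couple of shell scripts" can look like for this workflow (the host, backup path, and `setup.sh` are placeholders I've assumed, not the commenter's actual setup):

```shell
#!/bin/sh
# Re-create a service on a freshly rented bare-metal machine from a backup.
# HOST, the backup path, and setup.sh are placeholder assumptions.
set -eu

HOST=${1:?usage: restore.sh <new-host>}   # e.g. a fresh Hetzner/OVH box
BACKUP=/backups/service-latest.tar.gz     # assumed local backup archive

# Copy the backup and a machine-setup script to the new host, then run it.
scp "$BACKUP" setup.sh "root@$HOST:/root/"
ssh "root@$HOST" "sh /root/setup.sh /root/$(basename "$BACKUP")"

echo "Service restored on $HOST"
```

The point isn't the specific commands; it's that when your state lives in one backup archive and your configuration in one idempotent setup script, a rented machine becomes disposable.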
If you don't enjoy it then what were you doing working at a datacenter?
I enjoyed server admin, "back in the day", when your servers were pets and not cattle. But of course we have to make tech just as expendable as our workers, business school demands it! What if your pet server gets hit by a digital bus?!
Pet server classes are a much nicer concept anyway. I never liked instance-based personalization. Creating a machine, defining its class, and seeing it become a machine of its class is magical.
Of course, the newest idea is creating and destroying machines automatically... which outside of the cloud is quite pointless, but people want it anyway. I imagine seeing all that orchestration working must be even nicer than a machine autoconfiguring itself, but I have yet to see a place where it just works.
One argument for "cattle" servers on bare-metal is security. Being able to reset the machine to a clean, known-good state would clear any leftovers including potential malware. Having machines provisioned from images that include everything they need to run also means you don't even need to grant anyone root access (which you'd otherwise need to be able to audit so they don't leave anything malicious in there).
> Because, personally, I'd rather retire than deal with Dell, HP, Cisco, fibers, cooling issues, physical security, hardware failling...
This isn’t really a meaningful analysis though. It’s just “when you do things in house there are things you have to do”.
It’s like saying, “I would rather retire than clean the toilets, restock the toilet paper, etc” in a discussion about whether to outsource your bathroom maintenance. Doesn’t tell you what’s cost effective.
I'll be really curious how much change Oxide will bring to the status quo.
The promise is to be able to pay up front for a rack that will function as a highly capable VM, storage, and/or compute host, without any of the overhead that Dell, HP, and IBM bring. Just plug it in and start giving it workloads to do. All config can be done through the web-based management console or via the API, just like AWS.
All of that can be handled by your colocation facility. In most cases you won't ever reach the scale where building your own DC makes sense.
> Then you still need to pay VMWare for a decent virtualization platform
Should still be cheaper than paying the AWS premium including for bandwidth, not to mention that you don't always need virtualization. If all you need the bare-metal for is a handful of machines to do a very specific task that's too expensive on AWS then running directly on the metal is an option (and leave on AWS the stuff that does require the convenience of virtualization).
> I believe companies selling bare metal as a service are a happy compromise of cost and convenience, though.
Agreed. Most companies shouldn't ever deal with hardware directly - just rent it from a provider and let them do the maintenance.
I'm more or the less the sole decider for all tech decisions in my org (I don't have full budget authority, but I tell the budget holders what things cost). I'm 100% on board with cloud and even going further up the value chain to PaaS and SaaS. Cloud is expensive, but predictable. DevOps is very expensive and unpredictable. I can't even keep staff retained these days. Having a fixed dollar cost, even if it's high, saves not only the operations cost, but also the accounting cost and recruiting cost. And not just cost, but risk! Managed services are generally lower risk, and even if they aren't you can buy some indemnity that they'll cover some of the cost of failures.