
Having lived through the dot-com era, I find the AI era slightly dispiriting because of the sheer capital cost of training models. At the start of the dot-com era, anyone could spin up an e-commerce site with relatively little infrastructure cost. Now, it seems, only the hyper-scale companies can build these AI models: Meta, Google, Microsoft, OpenAI, etc.


I’m not sure we went through the same dot-com era, but in my experience, it was extremely expensive to spin up anything. You’d have to run your own servers, buy your own T1 lines, develop with rudimentary cgi… it was a very expensive mess - just like AI today

Which gives me hope that - like the web - hardware will catch up and stuff will become more and more accessible with time


> I’m not sure we went through the same dot-com era, but in my experience, it was extremely expensive to spin up anything. You’d have to run your own servers, buy your own T1 lines, develop with rudimentary cgi… it was a very expensive mess - just like AI today

To make your own competing LLM today you need hundreds of millions of dollars; the "very expensive" here is on a whole different level. You could afford the things you're talking about on a software engineering salary - it would be a lot of money for that engineer, but at least they could do it. No way anyone but a billionaire could fund a new competing LLM today.


I think the foundation models are a commodity, anyway. The bulk of the economic value, as usual, will be realized at the application layer. Building apps that use LLMs, including fine-tuning them for particular purposes, is well within reach even of indie/solo devs.

That’s why Sam Altman makes so much noise about “safety” - OpenAI would really like a government-backed monopoly position so they can charge higher rents and capture more of that value for themselves. Fortunately, I think that llama has already left the barn.


I think OpenAI/Anthropic/etc. are banking on foundation models being the "datacenters" or AWS equivalent of AI - there'll be PaaSes (e.g. Replicate), and most businesses will just pay the "rent"


Only if you're creating a foundation model. The equivalent would be competing with a well-funded Amazon back in 1999. You can compete in building LLM-powered products with much, much less money - less than a regular web app in '99.


Not everything has to be AI. You can run a small business's infra for MUCH less than you could back then, especially if you adjust for inflation (!).

Training AI models costs a fortune, but so far it's been just front-loading costs in hopes of a windfall. We'll see what actually happens.


Front loading costs to eventually extract rents on usage with one hell of a capital wall protecting the assets.

It's easier to spin up a business, for sure -- also easier to unwind it - they're not as sticky as they used to be.


If the government can stay back far enough that more than one AI company can train their models, it will end up working like steel mills - barely enough profit to pay the massive cost of capital due to competition. If the government regulates the industry into a monopoly, all bets are off. Their investors are going to push hard for shutting the door behind them so watch out.

The only question is: what tactic? I don't really know, but one trick I am aware of is "specifying to the vendor" - in other words, introducing regulatory requirements that are, at every step in the process, a description of the most favored vendor's product. As the favored players add more features, potentially safety features, those features are required in new regulations, using very specific descriptions that more or less mandate reproducing the existing technology - to borrow a software engineer's term, bug-for-bug. If your product is better in some ways but worse in others, you might have a chance in the market - but to no avail if the regulations demand exactly the advantages of the established suppliers.


Funny example: US Steel was a textbook case of a monopoly achieved privately because it wasn't regulated against.


They were on top for a while, but later fell behind because they didn't invest. There were heavy tariffs in place to "protect" the monopoly from foreign competition.


All true, but the initial agglomeration was 100% private


If privately arising monopolies could only be kept from buying out their regulators, they'd privately break down before they became too odious... for example Google, which for years was the only remotely good search engine, is now merely one of the better search engines. If there had been a "department of consumer information safety," staffed by the best industry professionals status can buy, that might not have happened.


This is typically called a high fixed cost business, like airlines, hotels/apartments, SpaceX, etc.

The dream may be barriers to entry that allow high margins (“rents” if you prefer the prejudicial), but all too often these huge capital costs bankrupt the company and lose money for investors (see: WeWork, Magic Leap). It is high risk, high return. Which seems fair.


Nothing in the WeWork business model is inherently capital intensive. Fundamentally they just take out long-term office leases at low rates, then sublease the space to short-term tenants at higher rates. They don't really own major assets and have no significant IP.


I understand the economics concept. I'm not sure WeWork was a great example; it had significant other challenges, such as a self-dealing founder and, frankly, a poor long-term model.

I would wager that the concept needs a bit of a refresh. Historically it has referred to high capital costs for the production of a hard good, though in this case there is more than just a good produced: there's a fair bit of influence and power associated with the good, and a ton of downstream businesses that will rely on it if it goes according to plan.


Agreed, and Magic Leap had its own problems. My point was just that “invest huge amounts of capital to create a moat and then monetize in the long run” is an inherently risky strategy. Business would not work if society insisted that large, high-risk investments could not produce higher long-term margins than less risky investments.


It's more like "disrupting the market". The problem is that it's a whole market.

Uber just now turned its first profit since 2009, and I would wager that if not for the newly found appreciation of efficiency and austerity, it would still be burning through money like a drunken socialist sailor.

The classic approach required basic math: "here is my investment, here is what I am going to charge for rent." You can actually figure out when your investment starts paying off.

This new "model" requires tall, loud, truth-massaging founders to "charm" VCs into giving away billions, with the promise of trillions, I guess. The founders do talk about conquering the world, like, a lot.

I do not know what the WeWork investors were thinking when they expected standard real estate to "10x" their money while the tenants were drinking free beer on tap. The whole thing screamed "scam" even to a lay-person.


So far it's been pretty "democratic" - I feel in no way disadvantaged because I can't train a foundation model myself. Actually the ecosystem is a lot better than 25 years ago - there are open-source (or source-available) versions of basically everything you'd need to participate in modern AI/ML.


But none of those are remotely as good as GPT-4, for example.


Mixtral?


Obviously not even close


I too went through the dot-com era: as in, when Sun Microsystems had the tagline "we are the dot in dot com".

I assure you that before Apache and Linux took over, that "dot" in the .com was not cheap!

Fortunately it only really lasted maybe 1993-1997 (I think Oracle announced Linux support in 1997, and that allowed a bunch of companies to start moving off Solaris).

But it wasn't until after the 2001 crash that people started doing sharded MySQL and then NoSQL to scale databases (when you needed it back then!).

It's early. You can do LoRA training now on home systems, and for $500 you can rent enough compute to do even more meaningful fine-tuning. Let's see where we are in 5 and 10 years' time.

(Provided the doomers don't get LLMs banned of course!)
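
To give a sense of what "LoRA training on home systems" involves, here's a minimal sketch using the Hugging Face peft library - the base model, rank, and target modules below are placeholder choices, not a tested recipe:

    # pip install transformers peft accelerate
    from transformers import AutoModelForCausalLM
    from peft import LoraConfig, get_peft_model

    base = "mistralai/Mistral-7B-v0.1"  # placeholder base model
    model = AutoModelForCausalLM.from_pretrained(base, device_map="auto")

    # LoRA trains small low-rank adapter matrices and leaves the base weights frozen,
    # which is what makes fine-tuning feasible on consumer hardware.
    config = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05,
                        target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM")
    model = get_peft_model(model, config)
    model.print_trainable_parameters()  # typically well under 1% of the full parameter count

From there it's an ordinary training loop (or Trainer run) over your own data.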


Another way to compete with the big tech incumbents: instead of hardware, try maths and software hacks to level the playing field! Training models is still black magic, so making it faster on the software side can solve the capital cost issue somewhat!
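
As one small example of the kind of software-side win I mean (just a sketch, assuming PyTorch and a CUDA GPU): running the heavy matmuls in bfloat16 via autocast cuts memory and speeds up training without touching the hardware budget.

    import torch

    model = torch.nn.Linear(4096, 4096, device="cuda")
    opt = torch.optim.AdamW(model.parameters(), lr=1e-4)
    x = torch.randn(64, 4096, device="cuda")

    # Mixed precision: compute-heavy ops run in bfloat16 while the weights stay in fp32.
    with torch.autocast(device_type="cuda", dtype=torch.bfloat16):
        loss = model(x).square().mean()
    loss.backward()
    opt.step()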


This kind of research is also incredibly capital intensive. You have to pay some of the smartest people around to work in it.


That's labour and human capital intensive, not capital intensive. And I don't mean this as a technically correct nitpick: in terms of economics it's more accurate to call it the exact opposite of capital intensive.


That’s a good point - what I wanted to say is that doing the research is also incredibly expensive, because it requires some of the smartest people around, with the right background (and what even is that background?)


Ye not a bad point - also agree with djhn on stuff.

It's true it'll still be relatively expensive - but I would propose it's relatively inexpensive if people want to make it faster and have the drive to do it :) On the other hand, capital expenditure requires large amounts of money, which also works.

I guess some general CUDA, some maths, knowing how to code transformers from scratch, some operating systems and hardware knowledge, and the constant drive to read new research papers + wanting to make things better.
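
For what it's worth, the "transformers from scratch" part is less scary than it sounds - the core attention operation is a few lines of matrix math. A minimal single-head sketch in PyTorch (no masking or projections):

    import torch
    import torch.nn.functional as F

    def attention(q, k, v):
        # q, k, v: (batch, seq_len, d_model)
        d = q.size(-1)
        scores = q @ k.transpose(-2, -1) / d ** 0.5  # pairwise token similarities
        weights = F.softmax(scores, dim=-1)          # each row sums to 1
        return weights @ v                           # weighted mix of the value vectors

    q = k = v = torch.randn(1, 8, 64)
    print(attention(q, k, v).shape)  # torch.Size([1, 8, 64])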

I just think, as humans, if we have the drive, we can do it no matter the constraints!


Yes, I agree with the general idea that it's not easy. Yet at least to some extent it might allow people and/or nations with a (relative) lack of capital but high levels of education and innovation to benefit and catch up.


I find the market way more open and competitive than the dot-com era. Everyone is throwing up a chatbot or RAG solution. There are tradesmen and secretaries and infinite 19-year-olds who are now able to wire together a no-code app or low-code bot and add value to real businesses. The hyperscalers are making some money but absolutely don't have this locked up. Any Groq or Mistral could wander in and eat their lunch, and we haven't really started the race yet. The next decade will be ridiculous.


Could not have said it better. Nobody has won the race yet, and things are getting better. Building a foundation model is not cheap, but it's still not out of reach for a startup.


It's not quite the same thing. A model is just one part of a product. You can spin up a product with zero infra by calling APIs that host models.
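
For example, a sketch using the OpenAI Python client as one arbitrary hosted-model API (the model name and prompt are placeholders):

    # pip install openai
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    def summarize(text: str) -> str:
        # The whole "product" side can be API calls - no GPUs or model weights on our end.
        response = client.chat.completions.create(
            model="gpt-4",  # placeholder; any hosted chat model works
            messages=[
                {"role": "system", "content": "Summarize the user's text in one sentence."},
                {"role": "user", "content": text},
            ],
        )
        return response.choices[0].message.content

    print(summarize("Some long customer support ticket ..."))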


Foundation models != application layer. The question is whether the application layer's lunch will be eaten by better foundation models.


As far as I know, training is the main issue.

I don't know a lot about ML. Does anyone know if it's possible to keep training the system while it is running?

That would help a lot if you can't use huge training sets as a starting point.


Ads and search engines use continuous incremental training to fold in new relevant information.
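
In plumbing terms, the usual pattern is just periodically resuming training from the last checkpoint on whatever new data has arrived. A hand-wavy PyTorch sketch (the model, data, and schedule are all placeholders):

    import torch

    model = torch.nn.Linear(128, 2)  # stand-in for the real model
    # model.load_state_dict(torch.load("checkpoint.pt"))  # resume from the deployed weights
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
    loss_fn = torch.nn.CrossEntropyLoss()

    def incremental_update(new_batches):
        # Train on the fresh examples, then publish a new checkpoint; the serving
        # process keeps answering with the old weights until it reloads.
        model.train()
        for x, y in new_batches:
            optimizer.zero_grad()
            loss_fn(model(x), y).backward()
            optimizer.step()
        torch.save(model.state_dict(), "checkpoint.pt")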


We will probably get there; it's just going to take time for hardware supply chains to catch up. I feel it's more comparable to the mainframe era - it took time for general-purpose computing to become commoditised.


Only hyper-scale companies like AT&T could build the fibre; scrappy startups like Google and Amazon ate their lunch.


They are also, so far, profitless (unless you are Nvidia) and useless. The last gasp of an industry on its last legs.


Fine-tuning is quite accessible for the average small business or hacker, though.



