But is this going to blow over in a few days? Again?
I can certainly appreciate frustration with the AMD stack, but to be blunt, I was not impressed with Hotz's YouTube rant from before.[1] It didn't give the impression of a stable framework, and this doesn't either.
Also (at least from the end-user LLM inference side of things) ROCm is not nearly as unusable as it used to be. We would certainly be renting MI300s over A100s (or even H100s) if we could get any, and we use a number of different inference backends.
The post-PC era of big, expensive hardware you can't even buy when you do have the money is upon us, and mercy, this is a scary, scary time in computing for me/us.
There are some boutique hosts like Hot Aisle serving MI300s (whom I really should reach out to), but for the immediate future our little startup is stuck with the big cloud providers. No MI300s for us mere mortals, not even to rent.
Besides the datacenter stuff, what exactly are people struggling to source these days? 30/40-series prices should be fairly stable relative to MSRP by now.
>The community is entrenched in 1.5 because that's what everyone is now familiar with, IMO
That probably carries some weight in the community's decision to stick with 1.5.
Other (and IMO more important) reasons we're still stuck on 1.5 are the nerfing of 2.0 and the plethora of user-trained models based on 1.5.
I continue to be amazed by the quality possible with 1.5. While each of the other image generators has its pros and cons, I haven't yet seen anything available to the public that can compete with the gens a competent SD prompter can produce.
SDXL seems to have taken off better than 2.0 did, but it's nothing so amazing as to justify leaving all the 1.5 models behind.
Well, personally, SDXL just blows 1.5 out of the water for me. I haven't had a reason to even touch 1.5 in months.
But note that SDXL is really awful for me in automatic1111 or vanilla HF diffusers. You have to use something with proper augmentations, like ComfyUI or Fooocus (which runs on ComfyUI).
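For reference, the "vanilla HF diffusers" path I'm comparing against is roughly this: the base checkpoint alone, with none of the extra steps a Comfy workflow adds (the prompt is just a placeholder):

    import torch
    from diffusers import StableDiffusionXLPipeline

    # Load the official SDXL base checkpoint in fp16.
    pipe = StableDiffusionXLPipeline.from_pretrained(
        "stabilityai/stable-diffusion-xl-base-1.0",
        torch_dtype=torch.float16,
        variant="fp16",
        use_safetensors=True,
    ).to("cuda")

    # A bare single-pass generation, without the augmentations
    # (refiner passes, tuned samplers, etc.) that Comfy workflows add.
    image = pipe(
        prompt="a photo of an astronaut riding a horse",
        num_inference_steps=30,
    ).images[0]
    image.save("sdxl_base.png")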
>You have to use something with proper augmentations, like ComfyUI or Fooocus (which runs on ComfyUI)
Yeah, Comfy was given a reference design of the SDXL model ahead of time so it would be supported when SDXL was released. I should probably switch to Comfy, but I don't touch the tech very frequently, as I don't have a practical use case beyond the coolness factor.
OK, I'll try SDXL? But I continue to believe that it was the botched release of SD 2.x, and the attempts to push people onto it, that led to whatever this thread is discussing about why Stability gets "so little support from the community": I lost interest in what they were working on well before they released SDXL, because 2.x left me unconvinced that their newer stuff would be better than their older stuff.
FWIW, "everyone had gotten so used to 1.5 that they just didn't want to bother with 2.x" might provide a similar mechanism, if a very different place for the blame: if people aren't paying attention to the new stuff you are building, it is going to hurt your "support".
It's more like the store being a literal hedge maze with an entrance fee, and once you get to the actual products, they're outrageously priced junk.
And the store is price fixing with nearby stores.
I think there's a difference between opportunistic crime and reasonably upstanding people being utterly frustrated with the market status quo.
Of course there is a ton of piracy that is just straight-up theft from perfectly convenient platforms, but the unacceptable, anticompetitive commercial platforms are the core of what keeps the community going, I think.
I do think this hits on the fundamental problem: you can't have real competition when selling intellectual property.
With physical goods, if the store is utterly terrible, at the very worst someone can buy the product and resell it at a nicer store. With intellectual property you are not allowed to do that. Everything is a vertically integrated monopoly.
Black holes are weird because they are essentially macroscopic particles with only one variable, mass (ignoring angular momentum, charge and other details for the moment). But they scale very strangely.
For instance, it's interesting to see what a black hole is like that emits precisely 100 W of radiation. Or one that is as big as an apple, or weighs as much as an apple. Or one that lives exactly 100 years, or that is as dense as water (black holes have the odd property of getting less 'dense' as their mass grows). It's very unintuitive, and just punching values into the calculator illustrates it perfectly.
I bring this up because the illustration (a town-sized black hole sitting over New York) is a very specific configuration, and it doesn't really illustrate how oddly these objects scale.
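To make that concrete, here is a minimal sketch of such a calculator (standard Schwarzschild/Hawking formulas in SI units; the specific example masses are just illustrative):

    import math

    G    = 6.674e-11   # gravitational constant, m^3 kg^-1 s^-2
    c    = 2.998e8     # speed of light, m/s
    hbar = 1.055e-34   # reduced Planck constant, J*s
    kB   = 1.381e-23   # Boltzmann constant, J/K

    def schwarzschild_radius(m):
        # Event horizon radius; grows linearly with mass.
        return 2 * G * m / c**2

    def hawking_temperature(m):
        # Hawking temperature in kelvin; falls as 1/M.
        return hbar * c**3 / (8 * math.pi * G * m * kB)

    def hawking_power(m):
        # Radiated power in watts; falls as 1/M^2.
        return hbar * c**6 / (15360 * math.pi * G**2 * m**2)

    def density(m):
        # Mass over the volume enclosed by the horizon; falls as 1/M^2,
        # which is why huge black holes end up less 'dense' than water.
        r = schwarzschild_radius(m)
        return m / (4 / 3 * math.pi * r**3)

    # A black hole with the mass of an apple (~0.1 kg):
    print(schwarzschild_radius(0.1))  # ~1.5e-28 m, far smaller than a proton
    print(hawking_power(0.1))         # ~3.6e+34 W, it detonates instantly

    # The mass that radiates exactly 100 W (solve P(m) = 100 for m):
    m100 = math.sqrt(hbar * c**6 / (15360 * math.pi * G**2 * 100))
    print(m100)                       # ~1.9e+15 kg, roughly a small asteroid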
Temperature is analogous to surface gravity, which gets weaker at the event horizon as the black hole grows.
Temperature here is not the expectation of a kinetic-energy distribution of microscopic constituents. You don't measure a black hole's temperature with a thermometer, because you can't get information out of a black hole.
Isn't it thought that a black hole's mass is concentrated in a minuscule volume at the center, such that the mass is not distributed throughout the "volume" contained by the event horizon? I don't understand what density means for a black hole when it isn't known what's between the singularity and the event horizon.
I think from the outside it doesn't matter whether the mass is uniformly distributed throughout the volume or concentrated at the centre: by the shell theorem (Birkhoff's theorem, in general relativity), any spherically symmetric mass distribution looks exactly like a point mass from outside, so you would feel the same force of gravity. (I'd be interested to hear if there is an experiment you could do, just measuring gravity, to differentiate between the two!)
Groq's inference strategy appears to be "SRAM only." There is no external memory like GDDR or HBM. Instead, large models are split between networked cards, and the inputs/outputs are pipelined.
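As a toy sketch of the idea (my own illustration, not Groq's actual software): each card holds a contiguous slice of layers resident in SRAM, and micro-batches flow through the cards like an assembly line, so at steady state every card is working on a different micro-batch:

    from collections import deque

    class Card:
        # A 'card' owns a contiguous slice of the model's layers,
        # with all weights resident in on-chip SRAM (no DRAM/HBM).
        def __init__(self, layers):
            self.layers = layers

        def forward(self, x):
            for layer in self.layers:
                x = layer(x)
            return x

    def pipeline(cards, micro_batches):
        # stages[i] is the activation waiting at card i's input.
        stages = [None] * len(cards)
        inputs, outputs = deque(micro_batches), []
        while inputs or any(s is not None for s in stages):
            # Advance the last stage first so data moves strictly forward.
            for i in reversed(range(len(cards))):
                if stages[i] is None:
                    continue
                y = cards[i].forward(stages[i])
                stages[i] = None
                if i + 1 < len(cards):
                    stages[i + 1] = y
                else:
                    outputs.append(y)
            if inputs:
                stages[0] = inputs.popleft()
        return outputs

    # 4 cards, each owning two toy 'layers' that double their input:
    double = lambda x: x * 2
    cards = [Card([double, double]) for _ in range(4)]
    print(pipeline(cards, [1, 2, 3]))  # [256, 512, 768]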
This is a great idea... In theory. But it seems like the implementation (IMO) missed the mark.
They are using reticle-sized dies running at high TDPs, one die per card, with long wires running the interconnect.
A recent Microsoft paper proposed a similar strategy, but with much more economical engineering: much smaller, cheaper SRAM-heavy chips tiled across a motherboard, with no need for a power-hungry long-range interconnect and no expensive dies on pricey PCIe cards. The interconnect is physically much shorter and lower-power by virtue of being on a motherboard.
In other words, I feel that Groq took an interesting inference strategy and ignored a big part of what makes it cool, packaging their chips like PCIe GPUs instead of tiled accelerators. Combined with the node disadvantage and the compatibility disadvantage, I'm not sure how they can avoid falling into obscurity like Graphcore, which took a very similar SRAM-heavy approach.
I would point to old-school forums (and HN!), where the communities were just large enough to moderate themselves and stop nasty off-topic junk like that from appearing.
One problem is that modern social media has (mostly) yanked this self-moderation power from users. And, as you said, they are too big and too cheap to hire enough mods on payroll to replace them.
Another problem, for me, is that they make money from the drug dealer posts! If Facebook has to leave them up, OK, but they sure as heck shouldn't display ads against them or collect metrics from them. That should be policed far more zealously.
Yeah, I really think basically all the moderation problems go away (from my very particular perspective of what the problems are and why they are problems) if you at least give users the option of fully controlling what they see (i.e., only posts from users they have specifically followed, etc.). But that option results in lower "engagement" and time-on-site, so social media companies will probably never go back to providing it.
Interestingly, Ollama is not popular at all in the "localllama" community (which also extends to related discords and repos).
And I think that's because of capabilities... Ollama is somewhat restrictive compared to other frontends. I have a litany of reasons I personally wouldn't run it over exui or koboldcpp, for both performance and output quality.
That's a necessary consequence of being stable and one-click, though.
As an example, they have expertise in designing huge silicon chips, big-iron servers, and fast interconnects. They even made a ternary chip in the past, and they have plenty of fabrication research. They could absolutely be a horse in the AI accelerator race.
That's just one example of many.
...But they don't take advantage of any of that, as if they are stuck in a corporate quagmire or something.
Exactly right. IBM has always been a leader in R&D, almost always 10 to 20 years ahead of what other companies are researching at the time. Their downfall has always been their implementations and their marketing/sales.
Neural chips, quantum computing, and ultra-high-speed fiber are just a few examples from the past few years, and just the ones we hear about. When it comes to anything AI/server/datacenter, and plenty more, you can pretty much bet that IBM has something significantly better just sitting on the shelf, so to speak.
Their patience and strategy in that area have always been impressive. When you know you have amazing technology but no chance of making money from it because of how difficult it is to manufacture, letting it sit until a worse version reaches scale, and trading a bigger leap forward for a smaller one that can actually be capitalized on, is the smart move.
They do it well, but they simply cannot stop tripping over themselves every time they try to convert that library of advanced technology into revenue.
Are they actually doing good work, or is it just marketing? My only reference is Watson, which... was hyped out the wazoo... and isn't state of the art.
Good work in the R&D and strategic acquisitions? Definitely. Not that long ago they announced a significant breakthrough in scalable quantum computer architecture that will allow them to create faster and faster quantum computers. And the fact that most people didn't even notice is exactly what I mean by tripping over themselves.
With respect - are you sure? Do you know this space? I don't know enough about quantum computing to be able to decipher BS from legit work. I do know there have been pronouncements for a decade now, and not much reality.
1: https://news.ycombinator.com/item?id=36193625