
OpenAI's goal is definitely not to give everyone unlimited, equal access to powerful tools like GPT-3. There have been countless jokes about the name 'OpenAI', and perhaps it's true that it's not the best name (along with 'democratizing' AI). But I'm not sure the author is suggesting a solution here so much as venting that things seem unfair, and no one outside of OpenAI has much control or much information about the things he asks about.

But I personally find the complaints understandable. As someone who never got a response to my requests for GPT-3 beta access, it felt pretty bad to watch everyone else have fun building cool things with the world's best text AI while I sat there unable to do anything, even though I was willing to pay for access.

Hopefully there will be other relevant players here besides just OpenAI sooner or later.



Surely the solution here is "put the model on BitTorrent, you cowards".

Like, okay, the model's big and unwieldy to run. But hardware's always getting better, and there are lots of research use-cases where it's okay if it takes ten minutes to page the model in and out of SSD while generating predictions. Plus, maybe we'd get some more discoveries in the field of efficiently running huge models.

The arguments about "safety" were PR nonsense when they were making them about GPT-2, and they're nonsense now. It's a robot that blends up Reddit posts in a food processor, barely more advanced than tapping the iPhone predict-next-word button over and over; it's not going to hack the Pentagon or take over the world. The only reason OpenAI has ever had to not publish their models (and I am ashamed that this industry doesn't call them out more often on this) is so that they can generate positive press coverage on launch day with irrefutable cherry-picked examples.


It’s beyond absurd to build a model with public, user-created data, gathered and released for free by a nonprofit, and then claim “uh, the model’s too big and unwieldy, so we have to keep things under lock and key.”

I don’t doubt that they’ll profit handsomely from this approach, but it’s the height of cynicism to engage in this kind of thing, and their statements around the practice should be read in that light.


The author, though, raises concerns about both the availability (openness) of the model and the current inability of most people to run it due to cost (equity/equality of access). Making the model available would still not make access equal.

I’m not saying I agree or disagree with the openness argument, but the equality argument is separate.


If they released it, people would figure out a way to run it “equitably” within months, if not weeks.

The amount of cheap GPU access floating out there is nuts. You can spin up a GPU instance to do best-in-class ML stuff using Fast.Ai on services like Paperspace or Colab, right now, for free.


(I work at OpenAI.)

> especially as someone that didn't get a response for my requests for GPT-3 beta access

We are still working our way through the beta list — we've received tens of thousands of applications and we're trying to grow responsibly. We will definitely get to you (and everyone else who applies), but it may take some time.

We are generally prioritizing people with a specific application they'd like to build, if you email me directly (gdb@openai.com) I may be able to accelerate an invite to you.


Thanks for the response. I had assumed the beta period was coming to an end soon, so that by the time I got access I'd have to pay just for basic experimentation. It was hard to say specifically what I'd build, since I'd need to experiment with the API first to see whether my ideas were feasible, so I probably did a poor job on that part of the application. But I appreciate the offer!


> We are generally prioritizing people with a specific application they'd like to build.

Why?


OpenAI's goals are (1) make money and (2) generate positive press coverage about OpenAI. (They make statements about wanting other things but that's mainly to help them achieve (2).)

Prioritizing people with concrete project ideas helps them in both areas: they're more likely to convert into paid customers down the line, and they're more likely to generate "OpenAI technology is now being used for X" press releases.


I think there's a fair argument that groups attempting to build a specific product are more likely to drive platform development than random individuals who just want to noodle around. This isn't to say that individual experimenters won't drive development too, just that when you're dealing with limited resources you have to make some decisions about allocation.

Framing it purely in terms of money and "generating positive press coverage" is a little cynical, IMO. Is prioritizing use cases that push the boundaries of today's technology, rather than "haha look, I can make GPT-3 parody VC Medium/LinkedIn articles", just press optics? I don't think so, but I can also understand the concern, especially given that this article is about democratization.


The article is hinting at this but I also think many people who complain that OpenAI didn't release the model don't understand how big this model actually is. Even if they had access to the parameters they couldn't do much with it.

Assuming you used half precision, the model is 350 gigabytes (175 billion parameters × 2 bytes). For fast inference the model needs to be in GPU memory. Most GPUs have 16GB of memory, so you would need 22 GPUs just to hold the model, and that doesn't even include the memory for activations.

If you wanted to do fine tuning, you would need 3x as much memory for gradients and momentum.
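A quick back-of-the-envelope sketch of that arithmetic (assuming 2-byte half-precision weights, 16 GB GPUs, and the 3x extra for gradients and optimizer state mentioned above; these are rough figures, not measurements):

```python
# Rough memory estimate for a 175B-parameter model.
N_PARAMS = 175e9
BYTES_PER_PARAM = 2              # half precision (fp16)
GPU_MEM_BYTES = 16 * 1e9         # ~16 GB per GPU

weights_gb = N_PARAMS * BYTES_PER_PARAM / 1e9
gpus_needed = -(-weights_gb * 1e9 // GPU_MEM_BYTES)   # ceiling division
finetune_gb = weights_gb * (1 + 3)  # weights + 3x for grads/optimizer state

print(weights_gb)    # 350.0
print(gpus_needed)   # 22.0
print(finetune_gb)   # 1400.0
```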


If it was open, there would be other services offering this, and not just an opaque beta and now a single expensive service.


I don't think it's a fair assessment as many researchers are disappointed that the model wasn't released. And I'm pretty sure they do understand the model size concerns.

Running inference on this massive model would be a really interesting challenge for people working on model compression and pruning as well as those working on low memory training. New challenges are always a good thing for research.

Personally, I just wish it was easier to get an access to their API. I have an experiment in mind that I can't wait to try.


Tell cryptocurrency miners that this is a big model to compute; next to mining, the size of this problem seems very tiny.

If there are millions of ASIC, GPU, and other devices mining cryptocurrencies, it is fair to speculate that there is plenty of room in that hardware base for democratizing AI.


A one-time investment of $60,000 ($200,000 in the worst case) isn't a reason to dismiss the 'many people who complain the model [wasn't released]', especially given that the alternative is 'being Microsoft', which costs $1,570,000,000,000.


You need only a tiny bit of memory for activations if you don't want fine-tuning, and I think for GPT-3 fine-tuning is off the table anyway. But it is reasonable to expect inference to take less than a minute with a single 3090 and a fast enough SSD.


OpenAI offers a fine-tuning API.

How did you come up with the one-minute estimate? According to a quick Google search, the fastest SSDs these days have a bandwidth of about 3100 MB/s, so it would take 112s just to read the weights.
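For reference, that estimate is simple division (assuming the 350 GB half-precision weight size discussed above and ~3100 MB/s sequential reads):

```python
# Time to stream 350 GB of weights from a single fast NVMe SSD.
model_bytes = 350e9
ssd_bandwidth = 3100e6        # ~3100 MB/s sequential read

seconds = model_bytes / ssd_bandwidth
print(int(seconds))           # 112
```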


I don't have access, so I can't see whether they have a fine-tuning API. Do you have any links explaining the fine-tuning you mention? It would certainly be surprising, given that no fine-tuning experiment is mentioned in the GPT-3 paper.

Weight loading is embarrassingly simple to parallelize. Just use mdadm with 3 or 4 NVMe SSDs and that's sufficient. You are more likely to be bound by PCIe bandwidth than by SSD bandwidth. Newer NVIDIA cards with PCIe 4 support help.
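As a rough sanity check on that claim (the drive count, per-drive bandwidth, and PCIe figure here are illustrative assumptions, not measurements):

```python
# Estimated load time when striping reads across several NVMe drives,
# capped by the host link (PCIe 4.0 x16 is ~32 GB/s theoretical).
model_gb = 350
drives = 4
drive_bandwidth = 3.1         # GB/s per NVMe drive (assumed)
pcie4_x16_bandwidth = 32.0    # GB/s (theoretical)

effective_bw = min(drives * drive_bandwidth, pcie4_x16_bandwidth)
print(round(model_gb / effective_bw, 1))   # 28.2
```

With four such drives the bottleneck is still the drives (~12.4 GB/s aggregate), not the PCIe link, bringing the load time under 30 seconds.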


"other relevant players" -> Google could create one.



