Hacker News | heyoni's comments

Treat it like you do customer support and automate it. Or is that not allowed?


Do you actually want automated take-downs? Isn't that what people already complain the most about on YouTube?


No. But caching takedowns is not the same at all as YouTube takedowns. It’s very very clear who owns what data and after the takedown the site owner can modify the robots.txt and move on.

So yes, takedowns here are fine.
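A minimal sketch of how a cache could honor that robots.txt change on its next refresh, using Python's standard `urllib.robotparser`. The helper name `may_cache`, the user agent `examplecachebot`, and the example URLs are hypothetical and only for illustration; this is not how any particular caching service works.

```python
from urllib.robotparser import RobotFileParser


def may_cache(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt allows user_agent to fetch url.

    Hypothetical helper: a cache would call this before storing or
    refreshing a copy of the page.
    """
    parser = RobotFileParser()
    # parse() takes the robots.txt content as an iterable of lines.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Example robots.txt a site owner might publish after a takedown (assumed).
ROBOTS_AFTER_TAKEDOWN = """\
User-agent: *
Disallow: /private/
"""

blocked = may_cache(ROBOTS_AFTER_TAKEDOWN, "examplecachebot",
                    "https://example.com/private/page.html")
allowed = may_cache(ROBOTS_AFTER_TAKEDOWN, "examplecachebot",
                    "https://example.com/public/page.html")
print(blocked, allowed)
```

With a check like this in the refresh path, a removed page would simply not be re-cached once the owner updates the file.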


It’s a cache. Relatively speaking, who cares?


It’s a cache; it was going to expire after a TTL anyway.


That’s not good enough. You can’t put out an open source project and expect to control the community around it to that degree.


Well, it's not an Open Source project. So there's that. ;)


Actually, this is open source. This is what the term refers to. What you mean is that it's not free (libre) software, which it most certainly is not. They do not respect you and your rights: https://www.gnu.org/philosophy/open-source-misses-the-point....


When people who care about open source talk about "open source", they often mean the Open Source Initiative definition. https://opensource.org/osd


Sigh. OSI was established specifically because they just wanted to share code and didn't care about GNU's 4 freedoms, so don't be surprised when the proponents of open source give you projects you can't even fork.


OSI was established specifically because companies got queasy over free software's quasi-religious rationale behind the "four freedoms" and wanted something that sounded a lot more pragmatic while effectively being the same thing.

Don't get the facts twisted: it was never "if you happen to peek at the code, it's open"; the full set of freedoms is what backs it. That's exactly what the OSD is about, which was linked in the parent, if you bothered to read it.


We're probably just talking past each other. The reason GNU comes off as religious is because they frame the whole issue in terms of human rights and moral imperatives (it is wrong to hide the source and we have the right to share). OSI prefers to frame it in terms of capabilities and transactional relationships (it's a win-win!).

The OSD is compatible with GNU licenses; I've read it before. That doesn't change the reality that by reframing the issue from rights to abilities, the OSI created the very environment where Winamp can be released as "open-source" while making forking illegal.

If you want a good look at OSI's commitment to the OSD, take a look at their Open Source AI initiative: https://opensource.org/deepdive https://opensource.org/deepdive/drafts/open-source-ai-defini...

They are directly complicit in propagating the lie of open-source AI. If you can't inspect how it was made, including the actual training data, you don't have the ability to understand how the AI was made. The choice of the lumber is part of how a chair is made.

Without the "I have the right to know what my computer is doing", there is nothing backing point 2 of the OSD. Without "I have the right to share", there is nothing backing point 1.


I'm not surprised. In this case, however, it doesn't even meet the OSI definition because it doesn't allow distribution of modified versions.


You might want to read the WinAMP license that you're claiming is Open Source.

Though it sounds like you might be misunderstanding things on purpose. (?)


Google does it with Go, Microsoft does it with VS Code. Open source cosplay, with or without an OSI-approved license, is rampant.


If I feed your reply to an llm will it be able to tell me what you’re talking about?

/edit I think it did alright…


I tried with Llama 3.1 and it hallucinated MFU. It also completely misinterpreted multi-node scaling.

==========================================================

Let's break down the results described in the post.

Context: The post is discussing the performance of a large language model on a MI300X GPU, which is a high-performance computing (HPC) system. The model has approximately 405 billion parameters and is trained using a batch size of 16 and sequence length of 64.

Key metrics:

MFU (Million Floating-Point Operations per Second): This is a measure of the model's performance, specifically the number of floating-point operations (FLOPs) it can perform per second.

FLOPs: The number of floating-point operations required to perform a matrix multiplication, which is a fundamental operation in deep learning.

GPU performance: The MI300X GPU is capable of 1.3 petaflops (1.3 x 10^15 FLOPs) per second in bfloat16 (a 16-bit floating-point format).

Calculations:

The author provides two calculations to estimate the MFU of the model:

Initial calculation: Assuming full weight training (not LoRA), the author estimates the MFU as:

405 billion parameters

2 FLOPs per matrix multiply per parameter

3 matrix multiplies (forward, backward parameter, and backward activation)

Batch size 16

Sequence length 64

30 seconds to complete the calculation

1.3 petaflops per second per GPU

8 GPUs

The calculation yields an MFU of approximately 0.8%.

Revised calculation: After correcting the assumption to use LoRA (a technique that reduces the number of FLOPs), the author revises the calculation by removing the backward parameter pass, which is only applied to adaptor weights (much smaller than the main weights). This yields an MFU of approximately 0.53%.

Interpretation:

The results indicate that the MI300X GPU is not yet optimized for this large language model, with an MFU of only 0.53% (or 0.8% in the initial calculation). This is a relatively low performance compared to the theoretical maximum of 1.3 petaflops per second per GPU. The author notes that the biggest issues are related to scaling to multiple nodes, suggesting that the performance may improve when running on a larger cluster.

The revised calculation using LoRA reduces the MFU by about 33%, indicating that using this technique can lead to a more efficient use of the GPU resources.

==========================================================

MFU means model FLOPs utilization. It is a measure of efficiency from 0% to 100%. 100% means that the model is running at maximum possible efficiency, i.e. 1.3 petaflops per GPU.

==========================================================

In that case, the results indicate that the MI300X GPU is running the large language model at a relatively low efficiency, with an MFU of 0.53% (or 0.8% in the initial calculation).

This means that the model is only utilizing a tiny fraction of the GPU's maximum theoretical performance of 1.3 petaflops per second. In other words, the model is not fully utilizing the GPU's capabilities, and there is a significant amount of headroom for optimization.

To put this into perspective, an MFU of 100% would mean that the model is running at the maximum possible efficiency, using 1.3 petaflops per second per GPU. An MFU of 0.53% or 0.8% is extremely low, indicating that the model is running at a tiny fraction of its potential performance.

The author's comment that the MI300X results are "promising functionally" suggests that the model is able to run, but the low MFU indicates that there are significant opportunities for optimization and performance improvement.
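For anyone who wants to check the numbers rather than trust the transcript, here is a rough sketch of the arithmetic using only the figures quoted above. It assumes the 30 seconds is the time for one training step on a single 8-GPU node, and the function and variable names are mine, not the original poster's.

```python
def mfu(params, batch_size, seq_len, matmuls, step_seconds,
        peak_flops_per_gpu, num_gpus):
    """Model FLOPs utilization: achieved FLOP/s divided by peak FLOP/s."""
    # 2 FLOPs (multiply + add) per parameter per token for each matmul pass.
    flops_per_step = params * 2 * matmuls * batch_size * seq_len
    achieved_flops_per_sec = flops_per_step / step_seconds
    peak_flops_per_sec = peak_flops_per_gpu * num_gpus
    return achieved_flops_per_sec / peak_flops_per_sec


# Figures quoted in the transcript; the 30 s per step reading is an assumption.
common = dict(params=405e9, batch_size=16, seq_len=64, step_seconds=30,
              peak_flops_per_gpu=1.3e15, num_gpus=8)

# Full-weight training: forward, backward-parameter, backward-activation.
mfu_full = mfu(matmuls=3, **common)
# LoRA: the backward-parameter pass over the main weights is dropped.
mfu_lora = mfu(matmuls=2, **common)

print(f"full: {mfu_full:.2%}, LoRA: {mfu_lora:.2%}")  # full: 0.80%, LoRA: 0.53%
```

The LoRA figure is two thirds of the full-weight figure only because fewer FLOPs are counted for the same 30 seconds, so the lower number reflects less counted work, not a faster or slower run.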


MFU being maximal floating point utilization right? I used chatgpt 4o and it otherwise looks very similar to your llama result


What was their reasoning?


To keep you running on the hamster wheel.

They'd ban Mozart and Shakespeare from the app store if they could.


I don’t think that was their reasoning.

Like, Google, all these megacorps, they are bad, but we should at least argue against their actual arguments.


Their reasoning is probably security. They're working under the assumption your app takes untrusted input in some way, maybe over the network. Which isn't a bad assumption, I mean almost all apps do. Very few apps are true self-contained applications, like a calculator.

So then if there happens to be some vulnerabilities in an older Android SDK then your app is susceptible. They could patch back security but that's expensive after a while. Easier to force app makers to update their apps.


3P app developers are also complicit. Often they deliberately cut off support for old OS's and old devices, because it's "too hard" to support them or whatever. Everyone seems to be working together to keep us on the hamster wheel.


Granted, it is hard. It's a whole extra version to QA on. If it works fine, fine, but if there are consistent negative user reviews on a version with < 5% market share, it's not worth it.

We don't support old iOS versions at all. We can't source new devices on old iOS versions so we can't reliably develop or test on them.


Exactly how I feel about every new React framework. It’s strictly worse than using any other framework and every recruiter continues to ask for it.

I don’t want to speak too negatively about the orgs that use it, but it definitely wouldn’t be the best choice from an engineering perspective for a new project.

Sorry, I am not a front-end developer. I am a general software engineer. Please don’t effectively sabotage my career because Silicon Valley wants to turn the entire discipline into a group of hamsters learning tools that aren’t used by the largest organizations.


> "It’s strictly worse than using any other framework"

If you actually believe that, consider yourself very lucky.

React, like any FE framework, can be implemented well or implemented badly.

React benefits from a very strong (imo the strongest) ecosystem, so if you set up your tooling and patterns correctly it's fantastic.

Here's my personal preference: NextJS as the backbone, RTKQ as the central data retrieval/API calls/caching management, RHF for form handling, ag-grid for data grids, and MUI as the component library (can optionally switch this to any equivalent).

If components are designed to be sufficiently generic and customizable, RTKQ is used to keep data fetching on component instances, and central state storage is avoided as much as possible, it's a great system. Unless you just really hate JSX syntax or something.


There may not have been any. Individual app-store reviewers can block you any time they feel like it, the guy checking your appeal is the same, and none of them have any real pressure to behave unless you have money and corporate power behind you.


Isn’t there also some Firefox AI integration that’s being tested by one dev out there? I forgot the name and wonder if it got any traction.


Think of all the accessories they could sell though to make that work! God knows Apple loves to sell accessories and marks them up quite high.


Also by downloading Firefox which comes with a safari plugin


It almost definitely does and probably always runs at a fraction of that speed while using up the battery. Even current gen phones don’t do face recognition unless you’re connected to a charger.


Except what would have been a Mac mini plugged into your router is now handheld and way less janky than needing the user to maintain two separate devices.


The phone doesn't have the memory, compute, cooling, or power budget to be a meaningful replacement for a desktop device, which in turn doesn't hold a candle to the power or economy of doing it on the server side.

Here privacy and utility are clearly in tension. The only sane thing to do is do it on the server, but if you ARE going to do it locally, a computer on the client's premises is the only realistic way, because a mobile device's memory, compute, cooling, and power budgets are so anemic, whatever a synthetic benchmark says the device can do in the space of a moment or two.

I presume the strategy a lot of players will actually follow is all server, all the time, with companies that are seriously concerned about privacy running it on a server they own rather than locally.


It doesn’t need to be a desktop. The point is they make their phones overpowered because nighttime operations while on a charger need that extra juice.


The point of this subthread is directly and only about AI. The original comment I replied to was:

> The whole selling point of Apple's AI stuff is that the processing for most things can be done on the phone and thus your private data won't have to leave your device

A desktop computer or server has substantially more power than an iPhone, even a nice one like the one in the linked article, because it has substantially more memory, power, cooling, and so forth.

I fully expect us to keep rapidly expanding what we can do with AI, and I expect those uses to require substantial resources that mobile devices will continue to lack.


And what your phone can’t do will be offloaded to servers. Meaning your phone can do a lot and an M1 processor isn’t necessarily a waste.


A word

