I had the same thought, and while this is a complete guess, it passes my sniff test personally. It’s possible that when this setting is not enabled, those wake events are not coalesced into hourly wakeups, but instead happen arbitrarily throughout the night. That would immediately lead to the behavior described.
Batching isn't mentioned anywhere. Do you have a positive reason to think this, or is it just the easiest hypothesis (besides a typo) to explain what the author wrote?
Way less fun than the ISS, which is like stepping outside on a cold day in comparison. We will have colonies on the moon, Mars, and all the major moons of Jupiter before people are living in the clouds on Venus.
Probably the most interesting topic in this article is trunk stable, although it only got a few sentences. The rest could have been written by Gemini and we wouldn't be able to tell. Trunk stable, though, is a very big deal in Android land, and I don't think many people outside of the Android team or the OEMs understand or even know about it. It is a very significant change in the Android release model, and I'm very curious to see how it goes with new phones.
The scaffolding and system prompting around Claude 4 is really, really good. More importantly it’s advanced a lot in the last two months. I would definitely not make assumptions that things are equal without testing.
Yes, but also a lot of other things. It's important to direct the LLM to emphasize some embeddings over others. This makes your chances of getting good results dramatically higher.
But that would be supervised learning, which we don't do (anymore) around here... honestly, I wouldn't be surprised if the whole craze circles back to good old supervision, albeit greatly empowered by what we have on the shelves today.
That would honestly be an extreme red flag for me if I were considering hiring you. As an older engineer myself, I would never roll the dice like that: my contributions in this industry are public and I have the network to show for it. My time is worth something to me, even if it's not worth anything to anyone else, and the older I get the more that's true.
If you trust everything the LLM tells you, and you learn from code, then yes, the same exact risks apply. But this is not how you use (or should use) LLMs when you're learning a topic. Instead you should use high quality sources, then ask the LLM to summarize them for you to start with (NotebookLM does this very well for instance, but so can others). Then you ask it to build you a study plan, with quizzes and exercises covering what you've learnt. Then you ask it to set up a spaced repetition worksheet that covers the topic thoroughly. At the end of this you will know the topic as well as if you'd taken a semester-long course.
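To make that concrete, the loop looks something like this. A rough Python sketch; `ask_llm` is a made-up placeholder for whatever chat client you use, not a real API:

    # Sketch of the study workflow described above. `ask_llm` is a
    # hypothetical placeholder, not a real library call.
    def ask_llm(prompt: str) -> str:
        raise NotImplementedError("wire up your LLM client of choice here")

    def study(source_text: str, topic: str) -> None:
        # 1. Summarize the high-quality source you picked yourself.
        summary = ask_llm(f"Summarize this source on {topic}:\n{source_text}")
        # 2. Turn the summary into a study plan with quizzes and exercises.
        plan = ask_llm(f"Build a study plan with quizzes and exercises from:\n{summary}")
        # 3. Generate spaced repetition cards covering the whole topic.
        print(ask_llm(f"Turn this plan into spaced repetition flashcards:\n{plan}"))

The point is that you, not the LLM, choose the sources; the LLM only does the busywork of turning them into study material.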
One big technique it sounds like the authors of the OAuth library missed is that LLMs are very good at generating tests. A good development process for today's coding agents is to 1) prompt with or create a PRD, 2) break this down into relatively simple tasks, 3) build a plan for how to tackle each task, with listed-out conditions that should be tested, 4) write the tests, so that things are broken, TDD style (sketched below), and finally 5) write the implementation. The LLM can do all of this, but you can't one-shot it these days; you have to be a human in the loop at every step, correcting when things go off track. It's faster, but it's not a 10x speedup like you might imagine if you think the LLM is just asynchronously taking a PRD some PM wrote and building it all. We still have jobs for a reason.
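For the curious, step 4 looks roughly like this. A minimal Python sketch; `refresh_token` is an invented name for illustration, not the actual API of the library in the post:

    # TDD style: the test exists before the implementation, so running it
    # fails first, by design. `refresh_token` is a hypothetical name.
    def refresh_token(grant: str):
        raise NotImplementedError  # step 5 replaces this stub

    def test_expired_grant_is_rejected():
        assert refresh_token("expired-grant") is None

The agent's job in step 5 is to make this pass; the human's job is to check that the test asserts the right thing in the first place.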
> Instead you should use high quality sources, then ask the LLM to summarize them for you to start with (NotebookLM does this very well for instance, but so can others).
How do you determine whether the LLM accurately reflects what the high-quality source contains if you haven't read the source? When learning from humans, we place trust in them to teach us based on a web of trust. How do you determine the level of trust with an LLM?
> When learning from humans, we place trust in them to teach us based on a web of trust.
But this is only part of the story. When learning from another human, you'll also actively try to gauge whether they're trustworthy based on general linguistic markers, and will try to find and poke holes in what they're saying so that you can question them intelligently.
This is not much different from what you'd do with an LLM, which is why it's such a problem that they're often more convincing than they are correct. But it's not an insurmountable issue. The other issue is that their trustworthiness will vary in a different way than a human's, so you need experience to know when they're possibly just making things up. But just based on feel, I think this experience is definitely possible to gain.
Because summarizing is one of the few things LLMs are generally pretty good at. Plus you should use the summary to determine if you want to read the full source, kind of like reading an abstract for a research paper before deciding if you want to read the whole thing.
Bonus: the high-quality source is going to be mostly AI-written anyway
I did actually use the LLM to write tests, and was pleased to see the results, which I thought were pretty good and thorough, though clearly the author of this blog post has a different opinion.
But TDD is not the way I think. I've never been able to work that way (LLM-assisted or otherwise). I find it very hard to write tests for software that isn't implemented yet, because I always find that a lot of the details about how it should work are discovered as part of the implementation process. This means both that any API I come up with before implementing is likely to change, and that it's not clear exactly what details need to be tested until I've fully explored how the thing works.
This is just me, other people may approach things totally differently and I can certainly understand how TDD works well for some people.
You can’t project trends out endlessly. If you could, FB would have 20B users right now based on early growth (just a random guess, you get the point). The planet would have 15B people on it based on growth rate up until the 90s. Google would be bigger than the world GDP. Etc.
One of the more bullish AI people, Sam Altman, has said that model performance scales with the log of compute. Do you know how hard it will be to move that number? We are already well into diminishing returns with current methodologies, and no one is pointing the way to a breakthrough that will get us to expert-level performance. RLHF is currently underinvested in, but it will likely be the path that gets us from junior contributor to mid-level in specific domains. That still leaves a lot of room for humanity.
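To spell out why log scaling is so brutal: if performance goes as k * log(compute), every additional unit of performance costs a constant multiple of compute. A toy Python illustration (the constant and the numbers are made up; only the shape of the curve matters):

    import math

    # Toy model: performance = k * log10(compute)
    k = 1.0

    def perf(compute: float) -> float:
        return k * math.log10(compute)

    print(perf(1e24))  # 24.0
    print(perf(1e25))  # 25.0 -- one more "unit" costs 10x the compute
    print(perf(1e26))  # 26.0 -- and the next unit costs 10x again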
The most likely reason for my PoV to be wrong is that AI labs are investing a lot of training time into programming, hoping the model can self improve. I’m willing to believe that will have some payoffs in terms of cheaper, faster models and perhaps some improvements in scaling for RLHF (a huge priority for research IMO). Unsupervised RL would also be interesting, albeit with alignment concerns.
What I find unlikely with current models is that they will show truly innovative thinking, as opposed to the remixed ideas presented as “intelligence” today.
Finally, I am absolutely convinced today's AI is already powerful enough to affect every business on the planet (yes, even the plumbers). I just don't believe it will replace us wholesale.
But this is not just an endless projection. In one sense we can't have economic growth and energy consumption grow endlessly, as that would eat up all the available resources on earth; there is a hard physical limit.
However, for AI this is not the case. There is literally an example of human-level intelligence existing in the real world. You're it. We know we haven't even scratched the limit.
It can be done, because an example of the finished product is humanity itself. The question is whether we have the capability to do it, and that we don't know. Given the trend, and the fact that a finished product already exists, it is totally realistic to say AI will replace our jobs.
There's no evidence we're even on the right track to human-level intelligence, so no, I don't think it's realistic to say that.
Counterpoint: our brains use about 20 watts of power. How much does AI use again? Does this not suggest that it's absolutely nothing like what our brains do?
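Rough numbers, in case anyone wants it spelled out (the 700 W is my approximation for a single high-end datacenter GPU; training clusters run thousands of them):

    brain_watts = 20       # rough estimate for a human brain
    gpu_watts = 700        # approximate draw of one modern datacenter GPU
    print(gpu_watts / brain_watts)    # ~35x, for just one chip
    print(10_000 * gpu_watts / 1e6)   # a 10,000-GPU cluster: ~7 MW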
There is evidence we're on the right track. Are you blind? The evidence is not definitive, but it's evidence that makes it a possibility.
Evidence: ChatGPT and all LLMs.
You cannot realistically say that this isn't evidence. Neither of these things guarantees that AI will take over our jobs but they are datapoints that lend credence to the possibility that it will.
On the other side of the coin, it is utterly unrealistic to say that AI will never take over our jobs when there is also no definitive evidence on this front.
> unrealistic to say that AI will never take over our jobs
That's not my position. I'm agnostic. I have no idea where it'll end up but there's no reason to have a strong belief either way
The comment you originally replied to is I think the sanest thing in here. You can't just project out endlessly unless you have a technological basis for it. The current methodologies are getting into diminishing returns and we'll need another breakthrough to push it much further
Then we're in agreement. It's clearly not a religious debate; you're just mischaracterizing it that way.
The original comment I replied to is categorically wrong. It's not sane at all when it's rationally and factually untrue. We are not projecting endlessly. We are hitting a one-year mark on a bumpy upward trendline that's been going for over 15 years. This one-year mark is a bump of slightly diminishing returns in LLM technology that's being exaggerated into an absolute limit on AI.
Clearly we've had all kinds of models developed in the last 15 years so one blip is not evidence of anything.
Again, we already have a datapoint here. You are a human brain; we know that an intelligence up to human level can be physically realized, because the human brain is ALREADY a physical realization. It is not insane to draw a projection in that direction, and it is certainly not an endless growth trendline. That's false.
Given the information we have, you gave it an "agnostic" outlook, which is 50/50. If you had asked me 10 years ago whether we would hit AGI, I would've given it a 5 percent chance, and now both of us are at 50/50. So your stance actually contradicts the "sane" statement you said you agree with.
We are not projecting infinite growth, and your own statement undercuts that accusation, because you believe there is a 50 percent possibility we will hit AGI.
Agnostic, at least as I was using it, was intending to mean 'who knows'. That's very different from a 50% possibility
"You are a human brain, we know that an intelligence up to human intelligence can be physically realized" - not evidence that LLMs will lead to AGI
"trendline that's been going for over 15 years" - not evidence LLMs will continue to AGI, even more so now given we're running into the limits of scaling it
AI winter is a common term for a reason. We make huge progress in a short amount of time, everyone goes crazy with hype, then it dies down for years or decades
The only evidence that justifies a specific probability is going to be technical explanations of how LLMs are going to scale to AGI. No one has that
1. LLMs are good at specific, well defined tasks with clear outcomes. The thing that got them there is hitting its limit
2. ???
3. AGI
What's the 2?
It matters.. because everyone's hyped up and saying we're all going to be replaced but they can't fill in the 2. It's a religious debate because it's blind faith without evidence
>Agnostic, at least as I was using it, was intending to mean 'who knows'. That's very different from a 50% possibility
I take "don't know" to mean the outcome is 50/50 either way because that's the default probability of "don't know"
> not evidence LLMs will continue to AGI, even more so now given we're running into the limits of scaling it
Never said it was. The human brain is evidence of what can be physically realized, and that is compelling evidence that it can be built by us. It's not definitive, but it's compelling. Fusion power is less compelling because we don't have an existing example of it working on earth.
>AI winter is a common term for a reason. We make huge progress in a short amount of time, everyone goes crazy with hype, then it dies down for years or decades
AI winter refers to a singular event across the entire history of AI; it is not a term for a recurring occurrence, as you seem to imply. We had one winter, and one instance is not enough to establish a pattern that will repeat.
>1. LLMs are good at specific, well defined tasks with clear outcomes. The thing that got them there is hitting its limit
What's the thing that got them there? Training data?
>It matters.. because everyone's hyped up and saying we're all going to be replaced but they can't fill in the 2. It's a religious debate because it's blind faith without evidence
The hype is in the other direction. On HN everyone is overwhelmingly against AI and making claims that it will never happen. Also, artists are already being replaced; I worked at a company where artists did in fact get replaced by AI.
Full disclosure, I worked many years at Fitbit, in a very senior role.
The reason you want Whoop to have a subscription (and why I wear a Whoop now) is that it incentivizes the company to ship great hardware _and_ software. If you don't have a subscription, your company gets stuck on the treadmill of needing a new device to sell to the public every year (or even twice a year) so that you can continue to fund your business. Pebble found this out too, and it eventually led to their sale.
Worse, the launch dates are not movable. People are largely not going to wait and buy your new watch in January if they wanted to get a Christmas gift for their spouse. The same goes for Mother’s Day. The scope is also not movable, because it has to have certain things for people to be interested vs last year’s model. We all know what happens when you fix scope and date in the iron triangle — quality suffers.
The subscription model has a great property — you can ship the device to your customers when it’s ready and meets your quality bar, and you can theoretically do it for free, because they are already paying a subscription. I realize Whoop did not take this path with their latest device release, they are clearly trying to goose their revenue for a quarter or two. That said, at the end of the day, their full product offering will either earn your subscription or not, which means that you can be confident that they are aligned with your interests. You generally cannot say that about a pure hardware company, unless they are remarkably disciplined with respect to hiring. I have not seen an example of this in the wearable/health space.
I think you can derive a general principle from this, which is that if a company is incentivized to sell you crap, they will eventually do so. If instead they are required to repeatedly earn your business, they will either maintain high standards for their products or they will go out of business. I mostly choose to spend my money on products made by companies with the latter model.
> you can ship the device to your customers when it’s ready and meets your quality bar, and you can theoretically do it for free, because they are already paying a subscription
Which is exactly what Whoop _didn't_ do. It seems that the subscription model did not actually work for them.
> you can be confident that they are aligned with your interests
Not at all, as this demonstrates.
> If instead they are required to repeatedly earn your business
The trick is that Whoop dropped this requirement for themselves after they got folks to sign up and before they shipped a hardware update. Presumably they think they can keep running that back: lose all their customers, get new ones who don't know, rinse, and repeat. We'll see how that works out for them.
So, I mean, I think you have some great points but it just doesn't seem to work out that way in the real world.