Hacker News | paradite's comments

JSR does that? Now that might be a good reason to move my packages over to get rid of tsup.

Is this from Moonshot AI (the company behind Kimi K2), or a third party?

Judging from the design, I assume it's not officially related to the model.


The performance depends not only on the tool, but also on the model, the codebase you are working on (context), and the task given (prompt).

And all these factors are not independent. Some combinations work better than others. For example:

- Claude Sonnet 4 might work well for feature implementation on backend Python code using Claude Code.

- Gemini 2.5 Pro works better for bug fixes on frontend React codebases.

...

So you can't just test the tools alone while keeping everything else constant. Instead you get a combinatorial explosion of tool × model × context × prompt to test.
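A quick sketch makes the blow-up concrete (the tool, model, and task names below are placeholders for illustration, not a real test matrix):

```python
from itertools import product

# Hypothetical axes of an eval matrix; the names are illustrative only.
tools = ["Claude Code", "Cursor", "Aider"]
models = ["Sonnet 4", "Gemini 2.5 Pro", "GPT-4.1"]
contexts = ["backend-python", "frontend-react"]
prompts = ["feature", "bugfix"]

# Every combination has to be evaluated to test any one factor fairly.
combos = list(product(tools, models, contexts, prompts))
print(len(combos))  # 3 * 3 * 2 * 2 = 36 runs for a single pass
```

Even at this toy scale that's 36 runs; add a couple more models or task types and the count grows multiplicatively.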

16x Eval can tackle parts of the problem, but it doesn't cover factors like tools yet.

https://eval.16x.engineer/


What kind of questions / domains were you encountering false information on?


Most false information was on the hardware description language VHDL that I'm currently learning.


Ground it with text from a correct source. That's all it needs.


Then why not just use the source text directly and save yourself all the second-guessing?


It's actually more complex than just input and output tokens; various providers apply additional pricing rules:

- Off-peak pricing by DeepSeek

- Batch pricing by OpenAI and Anthropic

- Context window differentiated pricing by Google and Grok

- Thinking vs non-thinking token pricing by Qwen

- Input token tiered pricing by Qwen coder
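As a rough illustration of how these rules stack, here is a toy cost function. The rates, tier boundary, and discount factors are made up for the example, not any provider's actual prices:

```python
# Toy pricing model showing how discounts and tiers compose.
# All numbers are invented for illustration.
def cost_usd(input_tokens: int, output_tokens: int,
             off_peak: bool = False, batch: bool = False) -> float:
    # Context-window differentiated pricing: long prompts cost more per token.
    in_rate = 2.00 if input_tokens > 200_000 else 1.00   # $ per 1M input tokens
    out_rate = 5.00                                      # $ per 1M output tokens
    total = (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000
    if off_peak:
        total *= 0.5   # e.g. a hypothetical 50% off-peak discount
    if batch:
        total *= 0.5   # e.g. a hypothetical 50% batch discount
    return total

print(cost_usd(100_000, 10_000))              # 0.15
print(cost_usd(100_000, 10_000, batch=True))  # 0.075
print(cost_usd(300_000, 0))                   # 0.6 (long-context tier)
```

The point is just that "price per token" is not a single number: the same request can cost 4x more or less depending on when, how, and with how much context it is sent.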

I originally posted here: https://x.com/paradite_/status/1947932450212221427


I believe everyone should run their own evals on their own tasks or use cases.

Shameless plug, but I made a simple app for anyone to create their own evals locally:

https://eval.16x.engineer/


This is obviously AI generated, if that matters.

And I have an AI workflow that generates much better posts than this.


I think it's just written by someone who reads a lot of LLM output - lots of lists with bolded prefixes. Maybe there was some AI-assistance (or a lot), but I didn't get the impression that it was AI-generated as a whole.


"Hard truth" and "reality check" in the same post is dead giveaway.

I read and generate hundreds of posts every month. I have to read books on writing to keep myself sane and avoid sounding like an AI.


Absolutely! And you're right to think that. Here's why...


Apologies! You're exactly right, here's how this pans out…


True, the graphs are also wonky - the curves don't match the supposed math.


Yeah that was confusing to me


I wonder why a person from Bombay, India might use AI to help with an English-language blog post…

Perhaps more interesting is whether their argument is valid and whether their math is correct.


The thing that sucks about it is that maybe his English is bad (not his native language), so he relies on LLM output for his posts. I'm inclined to cut people slack for this. But the rub is that it's indistinguishable from spam/slop generated for marketing/ads/whatever.

Or it's possible that he is one of those people who _really_ adopted LLMs into _all_ of their workflow, I guess, and he thinks the output is good enough as is, because it captures his general points?

LLMs have certainly damaged trust in general internet reading now, that's for sure.


I am not pro or against AI-generated posts. I was just making an observation and testing my AI classifier.


The graphs don't line up. I'm inclined to believe they were hallucinated by an LLM and the author either didn't check them or didn't care.

Judging by the other comments this is clearly low-effort AI slop.

> LLMs have certainly damaged trust in general internet reading now, that's for sure.

I hate that this is what we have to deal with now.


I don't know why you do. I found the article interesting and derived value from it. I don't care whether it was an LLM or a human that gave me the value. I don't see why it should matter.


It matters to me for so many reasons that I can't go over them all here. Maybe we have different priorities, and that's fine.

One reason why LLM generated text bothers me is because there's no conscious, coherent mind behind it. There's no communicative intent because language models are inherently incapable of it. When I read a blog post, I subconsciously create a mental model of the author, deduce what kind of common ground we might have and use this understanding to interpret the text. When I learn that an LLM generated a text I've read, that mental model shatters and I feel like I was lied to. It was just a machine pretending to be a human, and my time and attention could've been used to read something written by a living being.

I read blogs to learn about the thoughts of other humans. If I wanted to know what an LLM thought about the state of vibe coding, I could just ask one at any time.


Agents have been a field in AI since the 1990s.

MDPs, Q-learning, TD learning, RL, and PPO are basically all about agents.

What we have today is still very much the same field it was then.
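For anyone unfamiliar with those acronyms, the core of that earlier agent work is the tabular Q-learning update. Here's a minimal sketch on a toy two-state environment; the environment and hyperparameters are made up purely for illustration:

```python
import random

# Toy 2-state chain: state 0 -> state 1 (terminal). Action 1 moves right
# (reward 1.0 on reaching state 1); action 0 stays put (reward 0.0).
alpha, gamma, epsilon = 0.5, 0.9, 0.1
Q = {(s, a): 0.0 for s in (0, 1) for a in (0, 1)}

random.seed(0)
for _ in range(200):                  # episodes
    s = 0
    for _ in range(100):              # cap episode length as a safety net
        if random.random() < epsilon:
            a = random.choice((0, 1))                       # explore
        else:
            a = max((0, 1), key=lambda act: Q[(s, act)])    # exploit
        s_next = s + a
        r = 1.0 if s_next == 1 else 0.0
        best_next = max(Q[(s_next, 0)], Q[(s_next, 1)])
        # The Q-learning update: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
        Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
        s = s_next
        if s == 1:
            break

print(Q[(0, 1)] > Q[(0, 0)])  # True: the agent learns that moving right is better
```

The update rule in the comment is the same temporal-difference machinery that PPO-style methods later built on; the "agent" framing was there from the start.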


This is an incredibly fascinating read into how OpenAI works.

Some of the details seem rather sensitive to me.

I'm not sure if the essay is going to stay up for long, given how "secretive" OpenAI is claimed to be.


I think you are confusing "I don't like it" with "It's not going to happen".

Just because you don't like it, it doesn't mean it's not going to happen.

Observe the world without prejudice. Think rationally without prejudice.


But the claim is not "it's going to happen"; the claim is "it is inevitable that it will happen", which is a much stronger claim.


Things “happen” in human history only because humans make them happen. If enough humans do or don’t want something to happen, then they can muster the collective power to achieve it.

The unstated corollary in this essay is that venture capital and oligarchs do not get to define our future simply because they have more money.


> do not get to define our future simply because they have more money

I don't like it, but it seems that more money is exactly why they get to define our future.


I refer you again to the essay; it's not inevitable that those with substantially more money than us should get to dominate us and define our future. They are but a tiny minority, and if/when enough of us see that future as not going our way, we can and will collectively withdraw our consent for the social and economic rules and structures which enable those oligarchs.


It is possible to have a society-wide revolution.

The French Revolution, the Iranian Revolution, and I'm sure a bunch of others throughout history.

Even the good revolutions are NOT NICE and the outcomes are not guaranteed.


Would you say the industrial revolution could have been stopped by enough humans not wanting it to happen?

> The unstated corollary in this essay is that venture capital and oligarchs do not get to define our future simply because they have more money.

AI would progress without them. Not as fast, but it would.

In my mind, the inevitability of technological progress comes from our competition with each other and a general desire to do work more easily and effectively. The rate of change will increase with more resources dedicated to innovation, but people will always innovate.


Currently, AI is improved through concerted human effort and energy-intensive investments. Without that human interest and effort, progress in the field would slow.

But even if AI development continues unabated, nothing is forcing us to deploy AI in ways that reduce our quality of life. We have a choice over how it's used in our society because we are the ones who are building that society.

> Would you say the industrial revolution could have been stopped by enough humans not wanting it to happen?

Yes, let's start in early 1800s England: subsistence farmers were pushed off the land by the enclosure acts and, upon becoming landless, flocked to urban areas to work in factories. The resulting commodified market of mobile laborers enabled the rise of capitalism.

So let's say these pre-industrial subsistence farmers had instead chosen to identify with the working class Chartism movement of the mid-1800s and joined in a general strike against the landed classes who controlled parliament. In that case, the industrial revolution, lacking a sufficiently pliable workforce, might have been halted, or at least occurred in a more controlled way that minimized human suffering.

