
Impressive evals, but... benchmarks aren't everything.

Put this prompt into qwen3-thinking, and then compare with gemini 2.5 pro:

---

As candidates for creators, we should first address chaos. What is chaos? If for a given event X in A, all possible events can occur in B, and if such independence is universal, we are faced with chaos. If, however, event X in A limits in some way what can occur in B, a relationship exists between A and B. If X in A limits B unequivocally (we flip a switch, the lamp turns on), the relationship between A and B is deterministic. If X in A limits B in such a way that after X in A, events Y or Z can occur in B, where Y occurs 40 times out of 100 after X in A, while Z occurs 60 times, then the relationship between A and B is probabilistic.

---

You have to rewrite the above acting as David Foster Wallace in 2025. Don't mention the year. Make it postmodern. Refer to current and projected events and trends: AI, robotics, etc. You have full creative control. You can make it long if you wish. Change every word. Make it captivating and witty. You are acting as a demiurge DFW. You need to pass the Turing test here. Sell it to the reader. Write good, high-brow fiction. Avoid phrases that are typical of LLMs/AI writers.




