
Fast dev tools are awesome and I am glad the TS team is thinking deeply about dev experience, as always!

One trade-off is that if the code for TS is no longer written in TS, the core team won't be dogfooding TS day in and day out anymore, which might hurt devx in the long run. This is one of the failure modes that hurt Flow (written in OCaml), IMO. Curious how the team is thinking about this.


Hey bcherny! Yes, dogfooding (self-hosting) has definitely been a huge part of making TypeScript's development experience as good as it is. The upside is the breadth of tests and infrastructure we've already put together to watch out for regressions. Still, to supplement this I think we will definitely be leaning a lot on developer feedback and will need to write more TypeScript that may not be in a compiler or language service codebase. :D


Interesting! This sounds like a surprisingly hard problem to me, from what I've seen of other infra teams.

Does that mean more "support rotations" for TS compiler engineers on GitHub? Are there full-stack TS apps that the TS team owns that ownership can be spread around more? Will the TS team do more rotations onto other teams at MSFT?


Ultimately the solution has to be breaking the browser monopoly on JS, via performance parity of WASM or some other route, so that developers can dogfood in performant languages instead across all their tooling, front end, and back end.


First, this thread and article have nothing to do with language and/or application execution performance. It is only about the tsc compiler execution time.

Second, JavaScript already executes quickly. Aside from arithmetic operations, it has now reached performance parity with Java, and highly optimized JavaScript (typed arrays and an understanding of how data is accessed from arrays and objects in memory) can come within 1.5x of C++ execution speed. At this point all the slowness of JavaScript is related to things other than code execution, such as garbage collection, unnecessary framework code bloat, and poorly written code.
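To make the typed-array point concrete, here is a rough sketch of the two memory layouts being compared (illustrative only, not a benchmark):

    // Object-per-element layout: each access chases a pointer.
    const boxed = Array.from({ length: 1_000_000 }, (_, i) => ({ value: i * 0.5 }));

    // Typed-array layout: one contiguous buffer of float64s, friendly to the JIT and CPU cache.
    const flat = new Float64Array(1_000_000);
    for (let i = 0; i < flat.length; i++) flat[i] = i * 0.5;

    function sumBoxed(xs: { value: number }[]): number {
      let total = 0;
      for (let i = 0; i < xs.length; i++) total += xs[i].value;
      return total;
    }

    function sumFlat(xs: Float64Array): number {
      let total = 0;
      for (let i = 0; i < xs.length; i++) total += xs[i];
      return total;
    }

    console.log(sumBoxed(boxed), sumFlat(flat));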

That being said, it isn't realistic to expect significantly faster execution times by replacing JavaScript with a WASM runtime. This is even more true after considering that many performance problems with JavaScript in the wild are human problems more than technology problems.

Third, WASM has nothing to do with JavaScript, according to its originators and maintainers. WASM was never created to compete with, replace, modify, or influence JavaScript. WASM was created as a language-agnostic, sandboxed replacement for Flash. And since WASM executes in its own agnostic sandbox, the cost of replacing an existing runtime is high: a JavaScript runtime is already available, whereas a WASM runtime is more akin to installing a desktop application on first run.


How do you reconcile this view with the fact that the TypeScript team rewrote the compiler in Go and it got 10x faster? Do you think that they could have kept it in TypeScript and achieved similar performance, but they didn't for some reason?


This was touched on in the video a little bit—essentially, the TypeScript codebase has a lot of polymorphic function calls, and so is generally hard to JIT optimize. JS to Go therefore yielded a direct ~3.5x improvement.

The rest of the 10x comes from multi-threading, which wasn't possible to do in a simple way in the JS compiler (efficient multithreading while writing idiomatic code is hard in JS).

JavaScript is very fast for single-threaded programs with monomorphic functions, but in the TypeScript compiler's case, the polymorphic functions and opportunity for parallelization mean that Go is substantially faster while keeping the same overall program structure.
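A toy illustration of the difference (the real compiler's AST has dozens of node kinds flowing through functions like this, which is what defeats the JIT's inline caches):

    // TypeScript types erase at runtime; the JIT only sees the runtime object shapes.
    interface Identifier { kind: "identifier"; text: string }
    interface Literal { kind: "literal"; raw: string; value: number }
    type Node = Identifier | Literal;

    function textLength(n: Node): number {
      // Polymorphic call site: objects of different shapes flow through it, so the
      // engine's inline cache misses; with dozens of shapes it goes megamorphic.
      return n.kind === "identifier" ? n.text.length : n.raw.length;
    }

    const nodes: Node[] = [
      { kind: "identifier", text: "foo" },
      { kind: "literal", raw: "42", value: 42 },
    ];

    let total = 0;
    for (const n of nodes) total += textLength(n);
    console.log(total);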


I have no idea about the details of their test cases. If they had used an even faster language like Cobol or Fortran maybe they could have gotten it 1,000,000x faster.

What I do know is that some people complain about long compile times in their code that can last up to 10 minutes. I had a personal application that was greater than 60k lines of code and the tsc compiler would compile it in about 13 seconds on my super old computer. SWC would compile it in about 2.5 seconds. This tells me the far greater opportunity for performance improvement is not in modifying the compiler but in modifying the application instance.


> maybe they could have gotten it 1,000,000x faster.

WTF.


Yeah this is an overly exaggerated claim


It was unwarranted sarcastic snark. That commenter was bitten by some bug.


Very short, succinct and informative comment. Thank you.


Are you looking for non-browser performance such as 3d? I see no case that another language is going to bring performance to the DOM. You'd have to be rendering straight to canvas/webgl for me to believe any of this.


The issue with Flow is that it's slow, flaky, and has shifted the entire paradigm multiple times, making version upgrades nearly impossible without also updating your dependencies, IF your dependencies adopted the new Flow version as well. Otherwise you're SOL.

As a result the number of libraries that ship Flow types has absolutely dwindled over the years, and now TypeScript has completely taken over.


Our experience is the opposite: we have a pretty large Flow-typed code base and can do a full check in <100ms. When we converted to TS (we decided not to merge it), we saw TypeScript take multiple minutes. It's worth checking out LTI and how typing the boundaries enables Flow to parallelize and give very precise error messages compared to TS. The third-party lib support is however basically dead, except the latest versions of Flow are starting to enable ingestion of TS types, so that's interesting.


They should write a TypeScript-to-Go transpiler (in TypeScript), so that they can write their compiler in TypeScript and use TypeScript to transpile it to Go.


Thanks everyone for all your questions! The team and I are signing off. Please drop any other bugs or feature requests here: https://github.com/anthropics/claude-code. Thanks and happy coding!


Hi everyone! Boris from the Claude Code team here. @eschluntz, @catherinewu, @wolffiex, @bdr and I will be around for the next hour or so and we'll do our best to answer your questions about the product.


One thing I would love to have fixed - I type in a prompt, the model produces 90% or even 100% of the answer, and then shows an error that the system is at capacity and can't produce an answer. And then the response that has already been provided is removed! Please just make it so I can still access the response that has already been provided, even if it is incomplete.


This. Claude team, please fix this!


The UX team would never allow it. You gotta stay minimal and definitely can't have any acknowledgement that a non-ideal user experience exists.


I'll be publishing a Firefox extension as a temporary fix, will post it here. (I don't use Chrome.)


I think a Tampermonkey script is a better solution?


I've made the extension, but I haven't been able to test it (hence I'd rather not release it). I use Claude daily, but I haven't bumped into the situation yet where the generated output would disappear.


Good news, I caught it today, I'll be able to iterate and at some point I'll publish my extension at Mozilla.
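For the curious, the rough shape of such an extension is just a content script that keeps snapshotting the reply as it streams (the selector below is a made-up placeholder; claude.ai's real DOM will differ):

    // content-script.ts -- sketch only.
    const SELECTOR = "[data-testid='assistant-message']"; // hypothetical selector

    const observer = new MutationObserver(() => {
      const messages = document.querySelectorAll<HTMLElement>(SELECTOR);
      const last = messages[messages.length - 1];
      if (last) {
        // Snapshot the latest assistant reply so it survives a "capacity" error wiping the UI.
        localStorage.setItem("claude-last-reply", last.innerText);
      }
    });

    observer.observe(document.body, { childList: true, subtree: true, characterData: true });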


Yup. It's a really frustrating issue; it messes with you, like, c'mon, you were there at the last line.


To me it doesn't look like a bug. I believe it is an intended "feature" pushed from upper management - a dark pattern to make plebs pay for an answer that has overflowed the quota.


Plus one for this.


The biggest complaint I (and several others) have is that we continuously hit the limit via the UI after even just a few intensive queries. Of course, we can use the console API, but then we lose ability to have things like Projects, etc.

Do you foresee these limitations increasing anytime soon?

Quick Edit: Just wanted to also say thank you for all your hard work, Claude has been phenomenal.


We are definitely aware of this (and working on it for the web UI), and that's why Claude Code goes directly through the API!


I'm sure many of us would gladly pay more to get 3-5x the limit.

And I'm also sure that you're working on it, but some kind of auto-summarization of facts to reduce the context in order to avoid penalizing long threads would be sweet.

I don't know if your internal users are dogfooding the product that has user limits, so you may not have had this feedback - it makes me irritable/stressed to know that I'm running up close to the limit without having gotten to the bottom of a bug. I don't think stress response in your users is a desirable thing :).


This is the main point I always want to communicate to the teams building foundation models.

A lot of people just want the ability to pay more in order to get more.

I would gladly pay 10x more to get relatively modest increases in performance. That is how important the intelligence is.


As a growth company, they likely would prefer a larger number of users even with occasional rate limits, vs a smaller pool of power users.

As long as capacity is an issue, you can't have both.


If people are paying for use, then why can’t you have both?


It takes time to grow capacity to meet growing revenue/usage. As parent is saying, if you are in a growth market at time T with capacity X, you would rather have more people using it even if that means they can each use less.


If you can’t scale with your customer base fire your CTO.


The problem with the API is that it, as it says in the documentation, could cost $100/hr.

I would pay $50/mo or something to be able to have reasonable use of Claude Code in a limited (but not as limited) way as through the web UI, but all of these coding tools seem to work only with the API and are therefore either too expensive or too limited.


> The problem with the API is that it, as it says in the documentation, could cost $100/hr.

I've used https://github.com/cline/cline to get a similar workflow to their Claude Code demo, and yes it's amazing how quickly the token counts add up. Claude seems to have capacity issues so I'm guessing they decided to charge a premium for what they can serve up.

+1 on the too expensive or too limited sentiment. I subscribed to Claude for quite a while but got frustrated the few times I would use it heavily I'd get stuck due to the rate limits.

I could stomach a $20-$50 subscription for something like 3.7 that I could use a lot when coding, and not worry about hitting limits (or I suspect being pushed on to a quantized/smaller model when used too much).


Claude Code does caching well fwiw. Looking at my costs after a few code sessions (totaling $6 or so), the vast majority is cache read, which is great to see. Without caching it'd be wildly more expensive.

Like $5+ was cache read ($0.05 vs $3 per million tokens), so it would have cost $300+


I haven't been able to find the Claude CLI for public access yet. Would love to use it.


>>> npm install -g @anthropic-ai/claude-code

>>> claude



I paid for it for a while, but I kept running out of usage limits right in the middle of work every day. I'd end up pasting the context into ChatGPT to continue. It was so frustrating, especially because I really liked it and used it a lot.

It became such an anti-pattern that I stopped paying. Now, when people ask me which one to use, I always say I like Claude more than others, but I don’t recommend using it in a professional setting.


I have substantial usage via their API using LibreChat and have never run into rate limits. Why not just use that?


That sounds more expensive than the £18/mo Claude Pro costs?


Yes, but if you want more usage it is reasonable to expect to pay more.


Same.


If you are open to alternatives, try https://glama.ai/gateway

We currently serve ~10bn tokens per day (across all models). OpenAI compatible API. No rate limits. Built in logging and tracing.

I work with LLMs every day, so I am always on top of adding models. 3.7 is also already available.

https://glama.ai/models/claude-3-7-sonnet-20250219

The gateway is integrated directly into our chat (https://glama.ai/chat). So you can use most of the things that you are used to having with Claude. And if anything is missing, just let me know and I will prioritize it. If you check our Discord, I have a decent track record of being receptive to feedback and quickly turning around features.

Long term, Glama's focus is predominantly on MCPs, but chat, gateway and LLM routing is integral to the greater vision.

I would love feedback if you are going to give it a try: frank@glama.ai


The issue isn't API limits, but web UI limits. We can always get around the web interface's limits by using the claude API directly but then you need to have some other interface...


The API still has limits. Even if you are on the highest tier, you will quickly run into those limits when using coding assistants.

The value proposition of Glama is that it combines UI and API.

While everyone focuses on either one or the other, I've been splitting my time equally working on both.

Glama UI would not win against Anthropic if we were to compare them by the number of features. However, the components that I developed were created with craft and love.

You have access to:

* Switch models between OpenAI/Anthropic, etc.

* Side-by-side conversations

* Full-text search of all your conversations

* Integration of LaTeX, Mermaid, rich-text editing

* Vision (uploading images)

* Response personalizations

* MCP

* Every action has a shortcut via cmd+k (ctrl+k)


Ok, but that's not the issue the parent was mentioning. I've never hit API limits but, like the original comment mentioned, I too constantly hit the web interface limits particularly when discussing relatively large modules.


Right, that's how I read it also. It's not that there's no limits with the API, but that they're appreciably different.


Your chat idea is a little similar to Abacus AI. I wish you had a similarly affordable monthly plan for chat only, but your UI seems much better. I may give it a try!


> Even if you are on the highest tier, you will quickly run into those limits when using coding assistants.

Even heavy coding sessions never run into Claude limits, and I’m nowhere near the highest tier.


I think it’s based on the tools you’re using. If I’m using Cline I don't have to try very hard to hit limits. I’m on the second tier.


Just tried it, is there a reason why the webUI is so slow?

Try to delete (close) the panel on the right on a side-by-side view. It took a good second to actually close. Creating one isn't much faster.

This is unbearably slow, to be blunt.


Who is glama.ai though? Could not find company info on the site, the Frank name writing the blog posts seems to be an alias for Popeye the sailor. Am I missing something there? How can a user vet the company?


Do you have deepseek r1 support? I need it for a current product I’m working on.


Indeed we do https://glama.ai/models/deepseek-r1

It is provided by DeepSeek and Avian.

I am also midway of enabling a third-provider (Nebius).

You can see all models/providers over at https://glama.ai/models

As another commenter in this thread said, we are just a 'frontend wrapper' around other people's services. Therefore, it is not particularly difficult to add models that are already supported by other providers.

The benefit of using our wrapper is that you can use a single API key and you get one bill for all your AI usage, and you don't need to hack together your own logic for routing requests between different providers, failovers, keeping track of their costs, worrying about what happens if a provider goes down, etc.

The market at the moment is hugely fragmented, with many providers unstable, constantly shifting prices, etc. The benefit of a router is that you don't need to worry about those things.


Yeah I am aware. I use open router at the moment but I find it lacks a good UX.


Open router is great.

They have a very solid infrastructure.

Scaling infrastructure to handle billions of tokens is no joke.

I believe they are approaching 1 trillion tokens per week.

Glama is way smaller. We only recently crossed 10bn tokens per day.

However, I have invested a lot more into UX/UI of that chat itself, i.e. while OpenRouter is entirely focused on API gateway (which is working for them), I am going for a hybrid approach.

The market is big enough for both projects to co-exist.


They are just selling a frontend wrapper on other people's services, so if someone else offers deepseek, I'm sure they will integrate it.


I see Cohere, is there any support for in-line citations like you can get with their first party API?


This is also my problem. I've only used the UI with the $20 subscription; can I use the same subscription to use the CLI? I'm afraid it's like AWS API billing, where there is no limit to how much I can use and then I get a surprise bill.


It is API billing like AWS - you pay for what you use. Every time you exit a session we print the cost, and in the middle of a session you can do /cost to see your cost so far that session!

You can track costs in a few ways and set spend limits to avoid surprises: https://docs.anthropic.com/en/docs/agents-and-tools/claude-c...


What I really want (as a current Pro subscriber) is a subscription tier ("Ultimate" at ~$120/month ?) that gives me priority access to the usual chat interface, but _also_ a bunch of API credits that would ensure Claude and I can code together for most of the average working month (reasonable estimate would be 4 hours a day, 15 days a month).

i.e I'd like my chat and API usage to be all included under a flat-rate subscription.

Currently Pro doesn't give me any API credits to use with coding assistants (Claude Code included?), which is completely disjointed. And I need to be a business to use the API still?

Honestly, Claude is so good, just please take my money and make it easy to do the above!


I don’t think you need to be a business to use the API? At least I’m fairly certain I’m using it in a personal capacity. You are never going to hit $120/month even with full-time usage (no guarantees of course, but I get to like $40/month).


Careful -- a solo dev using it professionally, meaning, coding with it as a pair coder (XP style), can easily spend $1500/week.


$1500 is 100 million output tokens, or 500 million input tokens for Claude 3.7.

The entire LOTR trilogy is ~0.55 million tokens (1,200 pages, published).

If you are sending and receiving the text equivalent of several hundred copies of the LOTR trilogy every week, I don't think you are actually using AI for anything useful, or you are providing far too much context.
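For reference, that arithmetic assumes the published Sonnet pricing of $3 per million input tokens and $15 per million output tokens; as a quick sanity check:

    const inputUsdPerMTok = 3;   // Claude 3.7 Sonnet input
    const outputUsdPerMTok = 15; // Claude 3.7 Sonnet output
    const weeklyBudget = 1500;   // USD, from the parent comment
    console.log(weeklyBudget / outputUsdPerMTok);        // 100 MTok of output
    console.log(weeklyBudget / inputUsdPerMTok);         // 500 MTok of input
    console.log(weeklyBudget / inputUsdPerMTok / 0.55);  // ~900 LOTR trilogies of input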


You can do this yourself. Anyone can buy API credits. I literally just did this with my personal credit card using my gmail based account earlier today.

1. Subscribe to Claude Pro for $20 month

2. Separately, Buy $100 worth of API credits.

Now you have a Claude "ultimate" subscription where the credits roll over as an added bonus.

As someone who only uses the APIs, and not the subscription services for AI, I can tell you that $100 is A LOT of usage. Quite frankly, I've never used anywhere close to $20 in a month which is why I don't subscribe. I mostly just use text though, so if you do a lot of image generation that can add up quickly


I don't think you can generate images with Claude. I just asked it for a pink elephant: "I can't generate images directly, but I can create an SVG representation of a pink elephant for you." And it did it :)


That is a good idea. For something like Claude Code, $100 is not a lot, though.


You don't need to be a business to use the API.


Which is theoretically great, but if anyone can get an Aussie credit card to work, please let me know.


I haven’t had an issue with Aussie cards?

But I still hit limits. I use ClaudeMind with JetBrains stuff and there is a max on input tokens (I believe). I am 'tier 2' but it doesn't look like I can go past this without an enterprise agreement


No issue with an AU credit card here. It is a credit card and not a debit card though


I use AnythingLLM, so you can still have a "Projects"-like RAG.


Claude is my go-to LLM for everything. It sounds corny, but it's literally expanding the circle of what I can reasonably learn, many times over. Right now I'm attempting to read old philosophical texts (without any background in similar disciplines), and without Claude's help explaining the dense language in simpler terms, discussing its ideas, giving me historical context, explaining why it was written this or that way, and comparing it against newer ideas, I would've given up many times.

At work I use it many times daily in development. Its concise mode is a breath of fresh air compared to any other LLM I've tried. It has helped me find bugs in unfamiliar code bases, explained the tech stack to me, and written bash scripts, saving me dozens of hours of work and many nerves. It generally lets me reach places I wouldn't otherwise due to time constraints and nerves.

The only nitpick is that the service reliability is a bit worse than others', sometimes forcing me to switch to them. This is probably a hard question to answer, but are there plans to improve that?


I'm in the middle of a particularly nasty refactor of some legacy React component code (hasn't been touched in 6 years, old class based pattern, tons of methods, why, oh, why did we do XYZ) at work and have been using Aider for the last few days and have been hitting a wall. I've been digging through Aider's source code on Github to pull out prompts and try to write my own little helper script.

So, perfect timing on this release for me! I decided to install Claude Code and it is making short work of this. I love the interface. I love the personality ("Ruminating", "Schlepping", etc).

Just an all around fantastic job!

(This makes me especially bummed that I really messed up my OA a while back for you guys. I'll try again in a few months!)

Keep on doing great work. Thank you!


Hey thanks so much! <3


Just started playing with the command-line tool. First reaction (after using it for 5 minutes): I've been using `aider` as a daily driver, with Claude 3.5, for a while now. One of the things I appreciate about aider is that it tells you how much each query cost, and what your total cost is this session. This makes it low-key easy to keep tabs on the cost of what I'm doing. Any chance you could add that to claude-code?

I'd also love to have it in a language that can be compiled, like golang or rust, but I recognize a rewrite might be more effort than it's worth. (Although maybe less with claude code to help you?)

EDIT: OK, 10 minutes in, and it seems to have major issues doing basic patches to my Golang code; the most recent thing it did was add a line with incorrect indentation, then try three times to update it with the correct indentation, getting "String to replace not found in file" each time. Aider with Claude 3.5 does this really well -- not sure what the confounding issue is here, but might be worth taking a look at their prompt & patch format to see how they do it.
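For what it's worth, that failure mode is typical of exact-string edit formats, and a whitespace-tolerant fallback is one common mitigation. A rough sketch (purely illustrative, not how Claude Code or Aider actually implement it):

    function applyReplace(source: string, target: string, replacement: string): string {
      if (source.includes(target)) return source.replace(target, replacement);
      // Fallback: retry with leading whitespace stripped line-by-line, so an edit
      // that got the indentation wrong can still find its anchor.
      const strip = (s: string) => s.split("\n").map((l) => l.trimStart()).join("\n");
      const lines = source.split("\n");
      const targetLines = target.split("\n");
      for (let i = 0; i + targetLines.length <= lines.length; i++) {
        const window = lines.slice(i, i + targetLines.length);
        if (strip(window.join("\n")) === strip(target)) {
          const indent = window[0].match(/^\s*/)?.[0] ?? "";
          const fixed = replacement.split("\n").map((l) => indent + l.trimStart());
          return [...lines.slice(0, i), ...fixed, ...lines.slice(i + targetLines.length)].join("\n");
        }
      }
      throw new Error("String to replace not found in file");
    }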


If you do `/cost` it will tell you how much you've spent during that session so far.


hi! You can do /cost at any time to see what the current session has cost


One of the silver bullets of Claude, in the context of coding, is that it does NOT use RAG when you use it via the web interface. Sure, you burn your tokens, but the model sees everything, and this lets it reply in a much better way. Is Claude Code doing the same and just doing document-level RAG, so that if a document is relevant and it fits, the whole document will be put inside the context window? I really hope so! Also, this means that splitting large code bases into manageable file sizes will make more and more sense. Another Q: is the context size of Sonnet 3.7 the same as 3.5's? Btw, thank you so much for Claude Sonnet; in recent months it has changed the way I work and I'm able to do a lot more now.


Right -- Claude Code doesn't use RAG currently. In our testing we found that agentic search out-performed RAG for the kinds of things people use Code for.


Interesting - can you elaborate a little on what you mean by agentic search here?


Since the Claude Code docs suggest installing Ripgrep, my guess is that they mean that Claude Code often runs searches to find relevant snippets to pull into the context.

I would argue that this is still RAG. There's a common misconception (or at least I think it's a misconception) that RAG only counts if you used vector search - I like to expand the definition of RAG to include non-vector search (like Ripgrep in this case), or any other technique where you use Retrieval techniques to Augment the Generation phase.

IR (Information Retrieval) was around for many decades before vector search became fashionable: https://en.wikipedia.org/wiki/Information_retrieval
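To make that concrete: retrieval-before-generation doesn't need vectors at all. A sketch of grep-style RAG (prompt assembly only; the model call is omitted):

    import { execFileSync } from "node:child_process";

    function retrieve(pattern: string, repoDir: string): string {
      try {
        // "Push" retrieval: we decide what to search for before calling the model.
        return execFileSync("rg", ["--max-count", "5", "-n", pattern, repoDir], { encoding: "utf8" });
      } catch {
        return "(no matches)"; // rg exits non-zero when nothing matches
      }
    }

    function buildPrompt(question: string, repoDir: string): string {
      // Naive: uses the question itself as the search pattern; real tools extract keywords first.
      return `Relevant code:\n${retrieve(question, repoDir)}\n\nQuestion: ${question}`;
    }

    console.log(buildPrompt("parseConfig", "./src"));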


I agree that retrieval can take many forms besides vector search, but do we really want to call it RAG if the model is directing the search using a tool call? That seems like an important distinction to me, and the name "agentic search" makes a lot more sense IMHO.


Yes, I think that's RAG. It's Retrieval Augmented Generation - you're retrieving content to augment the generation.

Who cares if you used vector search for the retrieval?

The best vector retrieval implementations are already switching to a hybrid between vector and FTS, because it turns out BM25 etc is still a better algorithm for a lot of use-cases.

"Agentic search" makes much less sense to me because the term "agentic" is so incredibly vague.


I think it depends who "you" is. In classic RAG the search mechanism is preordained, the search is done up front and the results handed to the model pre-baked. I'd interpret "agentic search" as anything where the model has potentially a collection of search tools that it can decide how to use best for a given query, so the search algorithm, the query, and the number of searches are all under its own control.


Exactly. Was the extra information pushed to the model as part of the query? It’s RAG. Did the model pull the extra information in via a tool call? Agentic search.


That's far clearer. Yes.


This is a really useful definition of "agentic search", thanks.


RAG is an acronym with a pinned meaning now, just like the word drone. Drone didn't really mean drone, but drone means drone now. No amount of complaining will fix it. :[


I guess it's what is sometimes called "self RAG": the agent looks inside the files, the way a human would, to find what's relevant.


As opposed to vector search, or…?


To my knowledge these are the options:

1. RAG: A simple model looks at the question, pulls some associated data into the context, and hopes that it helps.

2. Self-RAG: The model "intentionally"/agentically triggers a lookup for some topic. This can be via a traditional RAG or just string search, i.e. grep.

3. Full Context: Just jam everything into the context window. The model uses its attention mechanism to pick out the parts it needs. Best but most expensive of the three, especially with repeated queries.

Aider uses kind of a hybrid of 2 and 3: you specify files that go in the context, but Aider also uses Tree-sitter to get a map of the entire codebase, i.e. function headers, class definitions, etc., which is provided in full. On that basis, the model can then request additional files to be added to the context.
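A control-flow sketch of what option 2 looks like in practice (the `llm` function here is a dummy stand-in, not a real SDK call; the point is that the model drives retrieval and can search repeatedly):

    import { execFileSync } from "node:child_process";

    type ToolCall = { tool: "grep"; pattern: string } | { tool: "done"; answer: string };

    // Dummy stand-in so the sketch runs; a real agent would call an LLM here.
    function llm(transcript: string[]): ToolCall {
      return transcript.length === 1
        ? { tool: "grep", pattern: "parseConfig" }
        : { tool: "done", answer: transcript[transcript.length - 1] };
    }

    function answer(question: string, repoDir: string): string {
      const transcript = [question];
      for (let step = 0; step < 10; step++) {
        const call = llm(transcript); // the model decides whether and what to search
        if (call.tool === "done") return call.answer;
        let result: string;
        try {
          result = execFileSync("rg", ["-n", call.pattern, repoDir], { encoding: "utf8" });
        } catch {
          result = "(no matches)";
        }
        transcript.push(`grep ${call.pattern}:\n${result}`); // results fed back; it may search again
      }
      return "gave up";
    }

    console.log(answer("Where is config parsed?", "./src"));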


I'm still not sure I get the difference between 1 and 2. What is "pulls up some associated data into the context" vs ""intentionally"/agentically triggers a lookup for some topic"?


1. Tends to use embeddings with a similarity search. Sometimes called "retrieval". This is faster, but similarity search doesn't always work quite as well as you might want it to.

2. Instead lets the agent decide what to bring into context by using tools on the codebase. Since the tools used are fast enough, this gives you effectively "verified answers" so long as the agent didn't screw up its inputs to the tool (which will happen, most likely).


Does it make sense to use vector search for code? It's more for vague text. In code, relevant parts can be found by exact name match (in most cases; the two methods aren't exclusive).


Vector search for code can be quite interesting - I've used it for things like "find me code that downloads stuff" and it's worked well. I think text search is usually better for code though.


Been a long time casual — i.e. happy to fix my code by asking questions and copy/pasting individual snippets via the chat interface. Decided to give the `claude` terminal tool a run and have to admit it looks like a fantastic tool.

Haven't tried to build a modern JS web app in years — it took the claude tool just a few minutes of prompting to convert and refactor an old clunky tool into a proper project structure, using Svelte, Vite, and Tailwind (which I haven't built with before). Trying to learn how to even scaffold a modern app has felt daunting, and this eliminates 99% of that friction.

One funny quirk: I asked it to build a test suite (I know zilch about JS testing frameworks, so it picked vitest for me) for the newly refactored app. I noticed that 3 of the 20 tests failed and so I asked it to run vitest for itself and fix the failing things. 2 minutes later, and now 7 tests were failing...

Which is very funny to me, but also not a big deal. Again, it's such a chore to research test libs and then set things up to their conventions. That the claude tool built a very usable scaffold that I can then edit and iterate on is such a huge benefit by itself; I don't need (nor desire) the AI to be a complete turnkey solution.


Anthropic is back and cementing its place as the creator of the best coding models—bravo!

With Claude Code, the goal is clearly to take a slice of Cursor and its competitors' market share. I expected this to happen eventually.

The app layer has barely any moat, so any successful app with the potential to generate significant revenue will eventually be absorbed by foundation model companies in their quest for growth and profits.


I think an argument could be reasonably made that the app layer is the only moat. It’s more likely Anthropic eventually has to acquire Cursor to cement a position here than they out-compete it. Where, why, what brand and what product customers swipe their credit cards for matters — a lot.


if Claude Code offers a better experience, users will rapidly move from cursor to Claude Code.

Claude is for Code: https://medium.com/thoughts-on-machine-learning/claude-is-fo...


(1) That's a big if. It requires building a team specialized in delivering what Cursor has already delivered, which is no small task. There are probably only a handful of engineers on the planet who have, or can be incentivized to develop, the product intuition the Cursor founders have already developed in the market. And even then: say I'm an aspiring engineer / PM at Anthropic. Why would I choose to spend all of my creative energy copying what somebody else is doing for the same pay I'd get working on something greenfield, or more interesting to me, or more likely to get me a promotion?

(2) It's not clear to me that users (or developers) actually behave this way in practice. Engineering is a bit of a cargo cult. Cursor got popular because it was good but it also got popular because it got popular.


In my opinion you're vastly overestimating how much of a moat Cursor has. In broad strokes, it builds an index of your repo for easier referencing and then adds some handy UI hooks so you can talk to the model; there really isn't that much more going on. Yes, the autocomplete is nice at times, but it's at best like pair programming with a new hire. Every big player in the AI space could replicate what they've done; it's only a matter of whether they consider it worth the investment, given how fast the whole field is moving.


If Zed gets its agentic editing mode in, I'm moving away from Cursor again. I'm only with them because they currently have the best experience there. Their moat is zero, and I'd much rather use purely API models than a Cursor subscription.


Conversely, I think you're overestimating the value (or lack thereof) of technology relative to distribution and market timing.


> It requires building a team specialized in delivering what Cursor has already delivered which is no small task.

There are several AIDEs out there, and based on working with Cursor, VS Code, and Windsurf there doesn't seem to be much of a difference (although I like Windsurf best). What moat does Cursor have?


Just chiming in to say that AIDE (Artificial Intelligence Development Environment, I suppose) is such a good term for these new tools, imo.

It's one thing to retrofit LLMs into existing tools, but I'm more curious how this new space will develop as time goes on. Already stuff like the Warp terminal is pretty useful in day-to-day use.

Who knows, maybe this time next year we'll see more people programming by voice input instead of typing. Something akin to Talon Voice supercharged by a local LLM hopefully.


Cursor has no models; they don't even have an editor, it's just VS Code.


And TypeScript simply doesn't work for me. I have tried uninstalling extensions. It is always "Initializing". I reload windows, etc. It eventually might get there; I can't tell what's going on. At the moment, AI is not worth the trade-off of no TypeScript support.


My entire company of 100+ engineers is using Cursor on multiple large TypeScript repos with zero issues. Must be some kind of local setup issue on your end; it definitely works just fine. In fact I've seen consistently more useful / less junky results from using LLMs for code with TypeScript than with any other language, particularly when Cursor's "shadow workspace" option is enabled.


They do actually have custom models for autocomplete (which requires very low latency) and applying edits from the LLM (which turns out to require another LLM step, as they can’t reliably output perfect diffs)


I wonder if they will offer competitive request counts against Cursor. Right now, at least for me, the biggest downside to Claude is how fast I blow through the limits (Pro) and hit a wall.

At least with Cursor, I can use all "premium" 500 completions and either buy more, or be patient for throttled responses.


Reread the blog post, and I suspect Cursor will remain much more competitive on pricing! No specifics, but likely far exceeding typical Cursor costs for a typical developer. Maybe it's worth it, though? Look forward to trying.

>Claude Code consumes tokens for each interaction. Typical usage costs range from $5-10 per developer per day, but can exceed $100 per hour during intensive use.


> Reread the blog post, and I suspect Cursor will remain much more competitive on pricing!

Until Cursor burns through their funding and gives up or increases their price.


hi! I've been using Claude Code in a very complementary way to my IDE, and one of the reasons we chose the terminal is because you can open it up inside whichever IDE you want!


Why not just open source Claude Code? People have tried to reverse-engineer the minified version: https://gist.githubusercontent.com/1rgs/e4e13ac9aba301bcec28...


Paste it into Claude and ask it to make the minified code more readable ;)

Agree the code should just be open source but there's nothing secretive that you can't extract manually.


I did! It's 900% over the context window limit :D I will have to do it function by function. Let's see; a decent project for me and claude-3.7.



That repo is just there for issue reporting right now - https://github.com/anthropics/claude-code/issues - it doesn't contain the tool's source code.


There’s no source code in that repo.


Hi Boris, love working with Claude! I do have a question—is there a plan to have Claude 3.5 Sonnet (or even 3.7!) made available on ca-central-1 for Amazon Bedrock anytime soon? My company is based in Canada and we deal with customer information that is required to stay within Canada, and the most recent model from Anthropic we have available to us is Claude 3.


Concur. Models aren’t real until I can run them inside my perimeter.


A minor ChatGPT feature I miss with Claude is temporary chats. I use ChatGPT for a lot of random one-off questions and don’t want them filling up my chat history with so many conversations.


Hi and congrats on the launch!

Will check out Claude Code soon, but in the meantime one unrelated other feature request: Moving existing chats into a project. I have a number of old-ish but super-useful and valuable chats (that are superficially unrelated) that I would like to bring together in a project.


I really want to try your AI models, but "You must have a valid phone number to use Anthropic's services." is a show-stopper for me.

It's the only mainstream AI service that requests this information. After a string of security lapses by many of your competitors, I have zero faith in the ability of a "fast moving" AI-focused company to keep my PII data secure.


It's a phone number. It's probably been bought / sold a few times already. Unless you're on the level of Edward Snowden, I wouldn't worry about it. But maybe your sense of privacy is more valuable than the outcome you'd get from Claude. That's fine too.


It's my phone number... linked to my Google identity... linked to every submitted user prompt... linked to my source code.

There's also been a spate of AI companies rushing to release products and having "oops" moments where they leaked customer chats or whatever.

They're not run like a FAANG, they don't have the same security pedigree, and they generally don't have any real guarantee of privacy.

So yes, my privacy is more valuable.

Conversely: Why is my non-privacy so valuable to Anthropic? Do they plan on selling my data? Maybe not now... but when funding gets a bit tight? Do they plan on selling my information to the likes of Cambridge Analytica? Not just superficial metadata, but also an AI-summarised history of my questions?

The best thing to do would be not to ask. But they are asking.

Why?

Why only them?


It's an anti-abuse method. A valid phone number will always have a cost for spammers/multi-accounters to obtain en masse, but will have no cost for the desired user base (the assumption is that every worthwhile user already has a phone).

Captchas are trivially broken and you can get access to millions of residential IP addresses, but phone numbers (especially if you filter out VOIP providers) still have a cost.


Just buy a $5 burner phone number. No need to use your real one.


I pay for a number from voip.ms and use SMS forwarding. It's very cheap, and it works on Telegram as well, which seemed fairly strict at detecting most VoIP numbers.


Does the fact that it's so ungodly expensive and highly rate-limited kind of prove the modern point that AI actually uses tons of water and electricity per prompt? People are used to streaming YouTube while they sleep, and it's hard to think of other web technology this intensive. OpenAI is hostile to this subject. Does Claude have plans to tackle this?


> People are used to streaming YouTube while they sleep

Youtube is used to showing them ads while they sleep


Is there / are you planning a way to set $ limits per API key? Far as I can tell the "Spend limits" are currently per-org only which seems problematic.




Hi! I’ve been using Claude for macOS and iOS coding for a while, and it’s mostly great, but it’s always using deprecated APIs, even if I instruct it not to. It will correct the mistake if I ask it to, but then in later iterations, it will sometimes switch back to using a deprecated API. It also produces a lot of code that just doesn’t compile, so a lot of time is spent fixing the made up or deprecated APIs.


Awesome to see a new Claude model - since 3.5 its been my go-to for all code related tasks.

I'd really like to use Claude Code in some of my projects vs just sharing snippets via the UI, but I'm curious how doing this from our source directory might affect our IP, including NDAs, trade secret protections, prior disclosure rules on (future) patents, open source licensing restrictions re: redistribution, etc.?

Also hi Erik! - Rob


Hi Boris et al, can you comment on increased conversation lengths or limits through the UI? I didn't see that mentioned in the blog post, but it is a continued major concern of $20/month Claude.ai users. Is this an issue that should be fixed now or still waiting on a larger deployment via Amazon or something? If not now, when can users expect the conversation length limitations will be increased?


It would be great if we could upgrade API rate limits. I've tried "contacting sales" a few times and never received a response.

edit: note that my team mostly hits rate limits using things like aider and goose. 80k input tokens is not enough when in a flow, and I would love to experiment with a multi-agent workflow using Claude


Now that the world's gotten used to the existence of AI, any hope of removing the guardrails on Claude? I don't need it to answer "How do I make meth", but I would like to not have to social engineer my prompts. I'd like it to just write the code I asked for and not judge me on how ethical the code might be.

E.g. Claude will refuse to write code to wget a website and parse the HTML if you ask it to scrape your ex-girlfriend's Instagram profile, for ethical and ToS reasons, but if you phrase the request differently, it'll happily go off and generate code that does that exact thing.

Asking it to scrape my ex girlfriend's Instagram profile is just a stand in for other times I've hit a problem where I've had to social engineer my way past those guard rails, but does having those guard rails really provide value on a professional level?


Not having headlines like "Claude Gives Stalker Instructions" has a significant value to their business I would wager.

I'm very much in favour of removing the guardrails but I understand why they're in place. The problem is attribution. You can teach yourself how to engage in all manner of dark deeds with a library or wikipedia or a search engine and some time, but any resulting public outcry is usually diffuse or targeted at the sources rather than the service. When Claude or GPT or Stable Diffusion are used to generate something judged offensive, the outcry becomes an existential threat to the provider.


How is your largest customer, Cursor, taking the news that you'll be competing directly with them?


They probably aren't thrilled, but a lot of users will prefer a UI and I doubt Anthropic has the spare cycles to make a full Cursor competitor.


Unless Cursor had agreed to an exclusivity agreement with Anthropic, Anthropic was (and still is) at risk of Cursor moving to a different provider or using their middleman position to train/distill their own model that competes with Anthropic.


Honestly, is this something that Anthropic should be worried about? You could ask the same question of all the startups that were destroyed by OpenAI.


Anthropic is still making the shovels


Great, thanks! Could you compare this new tool to Aider?


Do you think Claude Code is "better", in terms of capabilities and token efficiency, than other tools such as Cline, Cursor, or Aider?


Claude Code is a research preview -- it's more rough, lets you see model errors directly, etc. so it's not as polished as something like Cline. Personally I use all of the above. Engineers here at Anthropic also tend to use Claude Code alongside IDEs like Cursor.


Thanks for the product! Glad to hear the (so called) "safety" is being walked back on, previously Claude has been feeling a little like it is treating me as a child, excited to try it out now.


In the console, TPM limit for 3.7 is not shown (I'm tier 4). Does it mean there is no limit, or is it just pending and is "variable" until you set it to some value?


We set the Claude Code rate limits to be usable as a daily driver. We expect hitting rate limits for synchronous usage to be uncommon. Since this is a research preview, we recommend you start small as you try the product though.


Sorry, I completely missed you're from the Code team. I was actually asking about the vanilla API. Any insights into those limits? It's still missing the TPM number in the console.


Your footnote 3 seems to imply that the low number for o1 and Grok3 is without parallelism, but I don't think it's publicly known whether they use internal parallelism? So perhaps the low number already uses parallelism, while the high number uses even more parallelism?

Also, curious if you have any intuition as to why the no-parallelism number for AIME with Claude (61.3%) is quite low (e.g., relative to R1 87.3% -- assuming it is an apples to apples comparison)?


Awesome work, Claude is amazingly good at writing code that is pretty much plug and play.

Could you speak at all about potential IDE integrations? An integration into Jetbrains IDEs would be super useful - I imagine being able to highlight a bit of code and having a plugin check the code graph to see dependencies, tests etc that might be affected by a change.

Copying and pasting code constantly is starting to seem a bit primitive.


Part of our vision is that because Claude Code is just in the terminal, you can bring it into any IDE (or server) you want! Obviously that has tradeoffs of not having a full GUI of the IDE though


Anyone know how to get access to it? Notably I'm debating purchasing for Claude Code, but being on NixOS I want to make sure I can install it first.

If this Code preview is only open to subscribers, it means I have to subscribe before I can even see if the binary works for me. Hmm

edit: Oh, there's a link to "joining the preview" which points to: https://docs.anthropic.com/en/docs/agents-and-tools/claude-c...


I much prefer the standalone design to being editor integrated.


JetBrains has an official MCP plugin


Thanks, I wasn't aware of the Model Context Protocol!

For anyone interested - you can extend Claude's functionality by allowing it to run commands via a local "MCP server" (e.g. make code commits, create files, retrieve third party library code etc).

Then when you're running Claude it asks for permission to run a specific tool inside your usual Claude UI.

https://www.anthropic.com/news/model-context-protocol

https://github.com/modelcontextprotocol/servers
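For a sense of what writing one of these servers involves, the TypeScript SDK's quick-start looks roughly like this (written from memory, so treat the exact imports and signatures as assumptions and check the linked repos):

    import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
    import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
    import { z } from "zod";

    const server = new McpServer({ name: "demo-tools", version: "0.1.0" });

    // Expose one tool; the client (e.g. the Claude desktop app) asks for permission before calling it.
    server.tool("shout", { text: z.string() }, async ({ text }) => ({
      content: [{ type: "text", text: text.toUpperCase() }],
    }));

    await server.connect(new StdioServerTransport());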


Why gatekeep Claude Code, instead of releasing the code for it? It seems like a direct increase in revenue/API sales for your company.


I'm not affiliated with Anthropic, but it seems like doing this would commoditize Claude (the AIaaS). Hosted AI providers are doing all they can to move away from being interchangeable commodities; it's not good for Anthropic's revenue for users to be able to easily swap out the backend of Claude Code for a local Ollama backend, or a cheaper hosted DeepSeek. Open sourcing Claude Code would make this option 1 or 2 forks/PRs away.


It's not hard to make; it's a relatively simple CLI tool, so there's no moat. Also, the minified source code is available.


> It's not hard to make, its a relatively simple CLI tool so there's no moat

There are similar open source CLI tools that predate Claude Code. It's reasonable to assume Anthropic chose not to contribute to those projects for reasons other than complexity, and charitably, Anthropic likely plans for differentiating features.

> Also, the minified source code is available

The redistribution license - or lack thereof - will be the stumbling block to directly reusing code authored by Anthropic without authorization.


What do I need to do to get unbanned? I have filled in the provided Google Docs form 3-4 times to no avail. I got banned almost immediately after joining. My best guess is that I got banned because I used a VPN. https://news.ycombinator.com/item?id=40808815


Is there a way to always accept certain commands across sessions? Specifically for things like reading or updating files I don't want to have to approve that each time I open a new repl.

Also, is there a way to switch models between 3.5-sonnet and 3.5-sonnet-thinking? Got the initial impression that the thinking model is using an excessive amount of tokens on first use.


When you are prompted to accept a bash command, we should be giving you the option to not ask again. If you're not seeing that for a specific bash command, would you mind running /bug or filing an issue on Github? https://github.com/anthropics/claude-code/issues

Thinking and not thinking is actually the same model! The model thinks automatically when you ask it to. If you don't explicitly ask it to think, it won't use thinking.


With Claude Code, how does history work? I used it with my account, ran out of credit, then switched to a work account, but there was no chat history or other saved context of the work that had been done. I logged back in with my account to try to copy it, but it was gone.


Right now no, but if you run in Docker, you can use `--dangerously-skip-permissions`

Some commands could be totally fine in one context, but bad in a different one, e.g. pushing to master


For the pokemon benchmark, what happened after the Lt Surge gym? Did the model stall or run out of context or something similar?


A bit off topic but I wanted to let you know that anthropic is currently in violation of EU Directive 98/6/EC:

> The selling price and the unit price must be indicated in an unambiguous, easily identifiable and clearly legible manner for all products offered by traders to consumers (i.e. the final price should include value added tax and all other taxes).

I wanted to see what the annual plan would cost as it was just displaying €170+VAT, and when I clicked the upgrade button to find out (I checked everywhere on the page) then I was automatically subscribed without any confirmation and without ever seeing the final price before the transaction was completed.


You can stuff your EU directives up your nose, like your bottle caps when you try to drink from a European bottle


The bottle caps are a joke, but how can anyone in their right mind be against transparent pricing?

You think it's acceptable that a company says the price is €170+VAT and then, after the transaction is complete, they inform you that the actual price was €206.50?


No, not OK. In this case, the recourse in the US is simple: contact the company, and when refused a refund, cancel the charge on your credit card with a couple of simple clicks in the app.


Hi Boris! Thank you for your work on Claude! My one pet peeve with Claude specifically, if I may: I might be working on a Svelte codebase and Claude will happily ignore that context and provide React code. I understand why, but I’d love to see much less of a deep reliance on React for front-end code generation.


When I first started using Cursor, the default behavior was for Claude to make a suggestion in the chat, and if the user agreed with it, they could click apply or cut and paste the part of it they wanted to use in their larger project. Now it seems the default behavior is for Claude to start writing files to the current working directory without regard for app structure or context (e.g., Claude likes to create another copy of config files that are defined elsewhere). Why change the default to this? I could be wrong, but I would guess most devs would want to review changes to their repo first.


Cursor has two LLM interaction modes, chat and composer. The chat does what you described first and composer can create/edit/delete files directly. Have you checked which mode you're on? It should be a tab above your chat window.


This is a question for Cursor team.


> We’ve also improved the coding experience on Claude.ai. Our GitHub integration is now available on all Claude plans—enabling developers to connect their code repositories directly to Claude

Would love to learn a bit more about how the GitHub integration works. From https://support.anthropic.com/en/articles/10167454-using-the... it seems it’s read only.

Does Claude Code let me take a generated/edited artifact and commit it back as a PR?


The https://claude.io/ integration is read-only. Basically you OAuth with GitHub and now you can select a repository, then select files or directories within it to add to either a Claude Project or to an individual prompt.

Claude Code can run commands including "git" commands, so it can create a branch, commit code to that branch and push that branch to GitHub - at which point you can create a PR.


Hey guys! I was wondering why you chose to build Claude Code as a CLI when many popular choices like Cursor and Windsurf fork VS Code. Do you envision the future of Claude Code abstracting away the codebase entirely?


We wanted to bring the model to people where they are without having to commit to a specific tool or radically change their workflows. We also wanted to make a way that lets people experience the model’s coding abilities as directly as possible. This has tradeoffs: it uses a lot of tokens, and is rough (eg. it shows you tool errors and model weirdness), but it also gives you a lot of power and feels pretty awesome to use.


I like this quite a bit, thank you! I prefer the Helix editor, and I hate the idea of running VS Code just to access some random code assistant.


It would be great to have a C# / .NET SDK available for Claude so it can be integrated into Semantic Kernel [0][1]. Are there any plans for this?

[0] https://github.com/microsoft/semantic-kernel/issues/5690#iss...

[1] https://github.com/microsoft/semantic-kernel/pull/7364


I'm curious why there are no results for the "Claude 3.7 Extended Thinking" on SWE-Bench and Agentic tool use.

Are you finding that extended thinking helps a lot when the whole problem can be posed in the prompt, but that it isn't a major benefit for agentic tasks?

It would be a bit surprising, but it would also mirror my experiences, and the benchmarks which show Claude 3.5 being better at agentic tasks and SWE tasks than all other models, despite not being a reasoning model.


Are you guys paying Claude for its assistance with your products


It would be amazing to be able to use an API key to submit prompts that use our Project Knowledge. That doesn't seem to be currently possible, right?


From the release you say: "[..] in developing our reasoning models, we’ve optimized somewhat less for math and computer science competition problems, and instead shifted focus towards real-world tasks that better reflect how businesses actually use LLMs."

Can you tell us more about the trade-offs here?

Also, are you using synthetic data for improving the responses here, or are you purely leveraging data from usage/partner's usage?


Thank you for the update!

I recently attempted to use the Google Drive integration but didn't follow through with connecting because Claude wanted access to my entire Google Drive. I understand this simplifies the user experience and reduces time to ship, but is there any way the team can add "reduce the access scope of the Google Drive integration" to your backlog? Thank you!

Also, I just caught the new Github integration. Awesome.


Small UX suggestion, but could you make submission of a prompt via URL parameter work? It used to be possible via https://claude.ai/new?q={query}, but that stopped working. It works for ChatGPT, Grok, and DeepSeek. With Claude you have to go and manually click the submit button.
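For reference, the pattern in question is just a prefilled-prompt deep link, so restoring it would let tooling do something like:

    const prompt = "Explain this stack trace";
    const url = `https://claude.ai/new?q=${encodeURIComponent(prompt)}`;
    console.log(url); // open in a browser to land on a pre-filled chat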


Who the heck is on your UX team?

WHY is a huge % of my UX filled with nothing? I would appreciate metrics, token graphs, etc.

https://i.imgur.com/VlxLCwI.png

Why so much wasted space? ... >>??

https://i.imgur.com/7LlCLUf.jpeg


Did you guys ever fix the issue where if UK users wanted to use the API they have to provide a VAT number?


Love the UI so far. The experience feels very inspired by Aider, which is my current choice. Thanks!


Serious question: What advice would you give to a Computer Science student in light of these tools?


Serious answer: learn to code.

You still need to know what good code looks like to use these tools. If you go forward in your career trusting the output of LLMs without the skills to evaluate the correctness, style, functionality of that code then you will have problems.

People still write low level machine code today, despite compilers having existed for 70+ (?) years.

We'll always need full-stack humans who understand everything down to the electrons even in the age of insane automation that we're entering.


Could not agree more! I have 20+ years experience and use Cursor/Sonnet daily. It saves huge amounts of time.

But I can’t imagine this tool in the hands of someone who does not have a solid understanding of programming.

You need to understand when to push back and why. It’s like doing mini code reviews all the time. LLMs are very convincing and will happily generate garbage with the utmost authority.

Don’t trust and absolutely verify.


+1 to this. There has never been a better time to learn to code - the learning curve is being shaved down by these new LLM-based tools, and the amount of value people with programming literacy can produce is going up by an order of magnitude.

People who know both coding and LLMs will be a whole lot more attractive to hire to build software than people who just know LLMs for many years to come.


Can you just make a blog post on this explaining your thesis in detail? It's hard for me not to see non-technical "vibe coding" [0] sidelining everyone in the industry except for the most senior of senior devs/PMs.

[0] https://x.com/karpathy/status/1886192184808149383


I will give a slightly more pessimistic answer. Someone studying CS right now probably expects to work in this profession for 30-40 years until retirement, expects it to keep paying much more than the average salary for most devs anywhere (not only for elite devs or those in the US), and expects it to stay easy to find such a job or switch employers.

I think the best period for software devs will be gone in a few years. Knowing how to code and fix things will still be important, but it will matter more to also be a jack-of-many-trades who provides broader value: know a little about SEO, have good design taste and be able to tweak a simple design, have good taste in how to organise code, and have better soft skills for managing or educating less tech-savvy staff.

Another option is to specialise in a currently difficult subfield (robotics, ML, CUDA, Rust) and try to be that elite dev, with the expectation that you would have to move to SV or another such tech hub.

The best general recommendation I would give right now to someone who is currently studying (especially someone not from the US) is to use the large amount of time you have now, with not much responsibility, to build some product that can provide semi-passive income on a monthly basis ($5k-$10k) and drag yourself out of this rat race. Even if you don't succeed, or the revenue stream eventually dries up, you will learn the other skills that will matter more later if you want to be employed (SEO, code and design taste, marketing, soft skills).

Most likely this window of opportunity will only last for the next few years, in a similar way to how the best window for mobile apps was the first ~2 years after the App Store launched.


I would love to make "side revenue", but frankly I am awful at practical idea generation. I'm not a founder type I think, maybe a technical co-founder I guess.


The thing I would like automated is highlighting a function in my code and then asking the AI to move it to a new module file and import that new module.

I would like this to happen easily like hitting a menu or button without having to write an elaborate "prompt" every time.

Is this possible?


I think most language servers have a feature like this, right?


Moving a function or class? Yes. But moving arbitrary lines of code into their own function in a new module is still a PITA, particularly when the lines of code are not consecutive.


So is moving a function or class possible? What actions do you need to take to accomplish that? Thanks


This is supported natively by most IDEs today.


At least PyCharm is good at it.


Hi there. There are lots of phrases/patterns that Claude always uses when writing, which was very frustrating with 3.5. I can see that those persist with 3.7. Is there any way for me to contact you and show them to you so you can hopefully address them?


Any chance there will be a way to copy and paste the responses into other text boxes (i.e., a new email) and not have to re-jig the formatting?

Lists, numbers, tabs, etc. are all a little time-consuming... minor annoyance but thought I'd share.


Can you give some insight into how you chose the reply limit length? It seems to cut off many useful programs that are 80%-90% done and if the limit were just a little higher it would be a source of extraordinary benefit.


If you can reproduce that, would you mind reporting it with /bug?


Just tried it with Claude 3.7 Sonnet, here is the share: https://claude.ai/share/68db540d-a7ba-4e1f-882e-f10adf64be91 and it doesn't finish outputting the program. (It's missing the rest of the application function and the main function.)

Here are steps to reproduce.

Background/environment:

ChatGPT helped me build this complete web browser in Python:

https://taonexus.com/publicfiles/feb2025/71toy-browser-with-...

It looks like this, versus the eventual goal: https://imgur.com/a/j8ZHrt1

in 1055 lines. But eventually ChatGPT couldn't improve on it anymore; it couldn't modify the code at my request so that inline elements would be on the same line.

If you want to run it, just download it and rename it to .py. I like Anaconda as an environment; after reading the code you can install the required libraries with:

conda install -c conda-forge requests pillow urllib3

then run the browser from the Anaconda prompt by just writing "python " followed by the name of the file.

2.

I tried to continue to improve the program with Claude, so that in-line elements would be on the same line.

I performed these reproducible steps:

1. Copied the code and pasted it into a Claude chat window with Ctrl-V. This keeps it in the chat as a paste.

2. Gave it the prompt "This complete web browser works but doesn't lay out inline elements inline, it puts them all on a new line, can you fix it so inline elements are inline?"

It spat out code until it hit section 8 out of 9, which is 70% of the way through, and gave the error message "Claude hit the max length for a message and has paused its response. You can write Continue to keep the chat going". Screenshot:

https://imgur.com/a/oSeiA4M

So I wrote "Continue", and it stopped when it was 90% of the way done.

Again it got stuck at 90% of the way done (second screenshot in the album above).

So I wrote "Continue" again.

It just gave an answer, but it never finished the program. There's no app entry point in the program; it completely omitted the rest of the main class itself and the code to invoke it, which would be something like:

        def run(self):
            self.root.mainloop()
    
    ###############################################################################
    # main
    ###############################################################################
    
    if __name__=="__main__":
        sys.setrecursionlimit(10**6)
        app=ToyBrowser()
        app.run()
so it only output a half-finished program. It explained that it was finished.

I tried telling it "you didn't finish the program, output the rest of it" but doing so just got it stuck rewriting it without finishing it. Again it said it ran into the limit, again I said Continue, and again it didn't finish it.

The program itself is only 1055 lines, it should be able to output that much.


You don't want all that code in one file anyway. Have Claude write the code as several modules. You'll put each module in its own file and then you can import functions and classes from one module to another. Claude can walk you through it.
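As a rough illustration (my own sketch with hypothetical file names, not anything Claude produced), the split could look something like this, where each file imports what it needs:

    # layout.py -- rendering/layout helpers live in their own module
    def layout_inline(elements):
        """Arrange inline elements on the same line (placeholder logic)."""
        return " ".join(elements)

    # browser.py -- the UI class imports from the layout module
    import tkinter as tk
    from layout import layout_inline

    class ToyBrowser:
        def __init__(self):
            self.root = tk.Tk()

        def run(self):
            self.root.mainloop()

    # main.py -- a thin entry point that wires everything together
    from browser import ToyBrowser

    if __name__ == "__main__":
        app = ToyBrowser()
        app.run()

Each chunk of functionality (parsing, layout, rendering, networking) gets its own file, and each file stays short enough that Claude can rewrite any one module without hitting the message length limit.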


Congrats on the launch! You said it's an important tool for you (Claude Code); how does this fit in with Copilot, Cursor, etc.? Do you/your teammates rely only on Claude Code? What do you reach for for different tasks?


Claude Code is super popular internally at Anthropic. Most engineers like to use it together with an IDE like Cursor, Windsurf, VS Code, Zed, Xcode, etc. Personally I usually start most coding tasks in Code, then move to an IDE for finishing touches.


Are there plans to add a web-search function over some core websites (SO, API docs)? Competitors have it, and in my experience this provides very good grounding for coding tasks (way fewer hallucinated API functions).


What kind of sorcery did you use to create Claude? Honest question :)


Reticulating...


Does this actually have an 8k (or more) output context via the API?

3.5 did with a beta header but while 3.6 claimed to, it always cut its responses after 4k.

IIRC someone reported it on GH but had no reply.


Any way to parallelize tool use? When I go into a repo and ask "what's in here", I'm aiming for a summary that returns in 20 seconds.


My key got killed months ago when I tested it on a PDF, and support never got back to me so I am waiting for OpenRouter support!


Hi, what are the privacy terms for Claude Code? Is it memorizing the codebase it's helping with, from an enterprise standpoint?


Thank you to the team. Looks like a great release. Already switching existing prompts to Claude 3.7 to see the eval results :)


Thanks for this - exciting launch. Do you have examples of cool applications or demos that the HN crowd should check out?


We built Claude Code with Claude Code!


This is super cool and I hope y'all highlight it prominently!


Best demo - it's Claude Code all the way down. Claude Code === Claude Code


Hi! I've been working on demos where I let Claude Code run for hours at a time on a sandboxed project: https://x.com/ErikSchluntz/status/1894104265817284770

TLDR: asking Claude to speed up my code once 1.8x'd perf, but putting it in a loop and telling it to make it faster for 2 hours led to a 500x speedup!


YES!! I need infinite credits for infinite Claude Code. Will try it to get Claude to do all my work.


dunno who else to tell this but my pet request for the next version of Claude is to have it say "ensure" and "You're absolutely right!" less often


I assume you had a comprehensive test suite?


Lol, good one.


>Do you have examples of cool applications or demos that the HN crowd should check out?

Not OP obviously, but I've built so many applications with Claude, here are just a few:

[1]

Mockup of Utopian infrastructure support button (this is just a mockup, the buttons don't do anything): https://claude.site/artifacts/435290a1-20c4-4b9b-8731-67f5d8...

[2]

Robot body simulation: https://claude.site/artifacts/6ffd3a73-43d6-4bdb-9e08-02901d...

[3]

15-piece slider puzzle: https://claude.site/artifacts/4504269b-69e3-4b76-823f-d55b3e...

[4]

Canada joining the U.S., checklist: https://claude.site/artifacts/6e249e38-f891-4aad-bb47-2d0c81...

[5]

Secure encryption and decryption with AES-256-GCM with password-based key derivation (a rough generic sketch of this kind of scheme follows after this list):

https://claude.site/artifacts/cb0ac898-e5ad-42cf-a961-3c4bf8...

(Try to decrypt this message

kFIxcBVRi2bZVGcIiQ7nnS0qZ+Y+1tlZkEtAD88MuNsfCUZcr6ujaz/mtbEDsLOquP4MZiKcGeTpBbXnwvSLLbA/a2uq4QgM7oJfnNakMmGAAtJ1UX8qzA5qMh7b5gze32S5c8OpsJ8=

With the password "Hello Hacker News!!" (without quotation marks))

[6]

Supply-demand visualizer under tariffs and subsidies: https://claude.site/artifacts/455fe568-27e5-4239-afa4-051652...

[7]

fortune cookie program: https://claude.site/artifacts/d7cfa4ae-6946-47af-b538-e6f992...

[8]

Household security training for classified household members (includes self-assessment and certificate): https://claude.site/artifacts/7754dae3-a095-4f02-b4d3-26f1a5...

[9]

public service accountability training program: https://claude.site/artifacts/b89a69fb-1e46-4b5c-9e96-2c29dd...

[10]

Nuclear non-proliferation "big brother" agent technical demonstration: https://claude.site/artifacts/555d57ba-6b0e-41a1-ad26-7c90ca...

Dating stuff:

[11]

Dating help: Interest Level Assessment Game (is she interested?) https://claude.site/artifacts/523c935c-274e-4efa-8480-1e09e9...

[12]

Dating checklist: https://claude.site/artifacts/10bf8bea-36d5-407d-908a-c1e156...
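Regarding [5] above, here is a rough, generic sketch of that kind of scheme (password-based key derivation plus AES-256-GCM) using the Python cryptography package. The salt/nonce sizes, iteration count, and message layout are my own assumptions and won't match the artifact, so this cannot decrypt the sample message; it only shows the shape of the approach.

    # Generic password-based AES-256-GCM sketch (my own assumptions, not the
    # artifact's actual parameters or message format).
    import os
    from cryptography.hazmat.primitives import hashes
    from cryptography.hazmat.primitives.ciphers.aead import AESGCM
    from cryptography.hazmat.primitives.kdf.pbkdf2 import PBKDF2HMAC

    def derive_key(password: str, salt: bytes) -> bytes:
        # Stretch the password into a 256-bit key with PBKDF2-HMAC-SHA256.
        kdf = PBKDF2HMAC(algorithm=hashes.SHA256(), length=32, salt=salt,
                         iterations=600_000)
        return kdf.derive(password.encode())

    def encrypt(password: str, plaintext: bytes) -> bytes:
        salt, nonce = os.urandom(16), os.urandom(12)
        key = derive_key(password, salt)
        ciphertext = AESGCM(key).encrypt(nonce, plaintext, None)
        return salt + nonce + ciphertext  # pack salt and nonce with the ciphertext

    def decrypt(password: str, blob: bytes) -> bytes:
        salt, nonce, ciphertext = blob[:16], blob[16:28], blob[28:]
        return AESGCM(derive_key(password, salt)).decrypt(nonce, ciphertext, None)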


Hi Boris,

Would it be possible to bring back the June 2024 Sonnet?

That model was the most attentive.

Because we lost that model, this release is a value loss for me personally.


Seems to still be available via API as claude-3-5-sonnet-20240620


I tried signing up to use Claude about 6 months ago and ran into an error on the signup page. For some reason this completely locked me out from signing up since a phone number was tied to the login. I have submitted requests to get removed from this blacklist and heard nothing. The times I have tried to reach out on Twitter were never responded to. Has the customer support improved in the last 6 months?


You could try using it through GitHub Copilot, just as a different avenue for usage.


I don't want to use the product after having a bad experience. If they cannot create a sign-up page that doesn't break for me, why would I want to use this service? Things happen and bugs can occur, but the effort I have had to put in to resolve the issue far outweighs what the alternatives, which I have had no issues using, required of me.


Not a question but thank you for helping make awesome software that helps us make awesome software, too :)


Can you let the API team know that the /v1/models endpoint has been broken for hours? Thanks.


Hello! Member of the API team here. We're unable to find issues with the /v1/models endpoint—can you share more details about your request? Feel free to email me at suzanne@anthropic.com. Thank you!


It always returns a Not Found error for me. Using the curl command copied directly from the docs:

$ curl https://api.anthropic.com/v1/models --header "x-api-key: $ANTHROPIC_API_KEY" --header "anthropic-version: 2023-06-01"

{"type":"error","error":{"type":"not_found_error","message":"Not found"}}

Edit: Tried creating a different API key and it works with that one. Weird.


If you can reproduce the issue with the other API key, I'd also love to debug this! Feel free to share the curl -vv output (excluding the key) with the Anthropic email address in my profile


With Claude Code, how does history work? I used it with my account, ran out of credits, then switched to a work account, but there was no chat history or other saved context of the work that had been done. I logged back in with my account to try to copy it, but it was gone.


Did you run the Aider benchmarks to get a comparison of Claude Code vs. Aider?


Any updates on web search?


When are you providing an alternative to email magic login links?


What are your thoughts on having a UI/design benchmark?


Which starter pokemon does Claude typically choose?


I'd also be interested in stats on Helix Fossil vs. Dome Fossil.


Will Claude be available on Azure?


When there are two commands in a prompt, for example:

do A and then do B.

the model completely ignores the second task, B.


Will you guys ever allow remote work for engineers?


CLAUDE NUMBA ONE!!!

Congrats on the new release!


Hi @eschluntz, @catherinewu, @wolffiex, @bdr. Glad that you are so plucky and upbeat!

How do you feel about raking in millions while attempting to make us all unemployed?

How do you feel about stealing open source code and stripping the copyright?


Have you seen https://mycoder.ai? Seems quite similar. It was my own invention and it seems that you guys are thinking along similar lines - incredibly similar lines.



Nice!

It seems very, very similar. I open-sourced the code to MyCoder here: https://github.com/drivecore/mycoder and I'll compare them. Offhand I think both CodeBuff and Claude Code are missing the web debugging tools I added to MyCoder.


What do you do to build context?


Folks, let me tell you, AI is a big league player, it's a real winner, believe me. Nobody knows more about AI than I do, and I can tell you, it's going to be huge, just huge. The advancements we're seeing in AI are tremendous, the best, the greatest, the most fantastic. People are saying it's going to change the world, and I'm telling you, they're right, it's going to be yuge. AI is a game-changer, a real champion, and we're going to make America great again with the help of this incredible technology, mark my words.


The blog post also talks about how privacy is preserved in more concrete terms:

> These four steps are powered entirely by Claude, not by human analysts. This is part of our privacy-first design of Clio, with multiple layers to create “defense in depth.” For example, Claude is instructed to extract relevant information from conversations while omitting private details. We also have a minimum threshold for the number of unique users or conversations, so that low-frequency topics (which might be specific to individuals) aren’t inadvertently exposed. As a final check, Claude verifies that cluster summaries don’t contain any overly specific or identifying information before they’re displayed to the human user.
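To make the minimum-threshold idea concrete, here's a toy sketch of what that kind of filtering step could look like; this is my own illustration, not Clio's actual code or thresholds:

    # Toy illustration of a "minimum unique users per cluster" filter.
    from collections import defaultdict

    MIN_UNIQUE_USERS = 25  # hypothetical threshold

    def visible_clusters(conversations):
        """conversations: iterable of (cluster_id, user_id) pairs."""
        users_per_cluster = defaultdict(set)
        for cluster_id, user_id in conversations:
            users_per_cluster[cluster_id].add(user_id)
        # Only clusters backed by enough distinct users are ever surfaced,
        # so low-frequency (potentially identifying) topics stay hidden.
        return {cluster for cluster, users in users_per_cluster.items()
                if len(users) >= MIN_UNIQUE_USERS}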


Also, Vinge’s A Deepness in the Sky


Ink is awesome, but it does have some rough edges. For example, want to absolutely position something on screen? Not possible. The UI also tends to flicker when you’re doing anything complicated, in a way that can be hard to debug.


This! I hit both of these issues. I tried building a solitaire game using Ink. I couldn't stack and offset the cards on top of one another, so they took up a lot of screen real estate. Then when I had re-renders, the screen would flicker. Other than that it was a joy to use. I think I even tried flipping the solitaire board horizontally, but that felt too weird.


How are you measuring productivity? And is the effect you see in A/B tests statistically significant? Both of these were challenging to do at Meta, even with many thousands of engineers -- curious what worked for you.


This article, and all the articles like it, are missing most of the puzzle.

Models don’t just compete on capability. Over the last year we’ve seen models and vendors differentiate along a number of lines in addition to capability:

- Safety

- UX

- Multi-modality

- Reliability

- Embeddability

And much more. Customers care about capability, but that’s like saying car owners care about horsepower — it’s a part of the choice but not the only piece.


One somewhat obsessive customer here: I pay for and use Claude, ChatGPT, Gemini, Perplexity, and one or two others.

The UX differences among the models are indeed becoming clearer and more important. Claude’s Artifacts and Projects are really handy as is ChatGPT’s Advanced Voice mode. Perplexity is great when I need a summary of recent events. Google isn’t charging for it yet, but NotebookLM is very useful in its own way as well.

When I test the underlying models directly, it’s hard for me to be sure which is better for my purposes. But those add-on features make a clear differentiation between the providers, and I can easily see consumers choosing one or another based on them.

I haven’t been following recent developments in the companies’ APIs, but I imagine that they are trying to differentiate themselves there as well.


To me, the vast majority of "consumers" (as in B2C) only care about price, specifically free. Pro and enterprise customers may be more focused on the capabilities you listed, but the B2C crowd is overwhelmingly in the free-tier-only space when it comes to GenAI.


You may be forgetting that ChatGPT has 10M paying customers. Not to mention everyone that pays for Claude Pro, Perplexity Pro, and so on.


The math on this doesn't work.


…go on?


I’m assuming they mean that if you multiply subscribers by the subscription fee then OpenAI still ends up losing billions per year.
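Back-of-the-envelope, assuming the standard $20/month consumer price and ignoring API, Team, and Enterprise revenue (which I don't have figures for):

    # Rough arithmetic only; the real revenue mix also includes API and
    # enterprise tiers, and not every subscriber is on the $20 plan.
    subscribers = 10_000_000
    monthly_price_usd = 20
    annual_subscription_revenue = subscribers * monthly_price_usd * 12
    print(f"${annual_subscription_revenue:,} per year")  # $2,400,000,000 per year

Whether that squares with the reported revenue and cost numbers is the real question.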


Look at their total revenue numbers, look at their cheapest pricing option.

The math doesn't work for them to have 10m paying customers. It's not even close.


> OpenAI has 10M paying customers

According to who?


> OpenAI COO Says ChatGPT Passed 11 Million Paying Subscribers

https://www.theinformation.com/articles/openai-coo-says-chat...


Not OP but what is your guess? Bloomberg says 1M customers in the business plan [1].

[1] https://www.bloomberg.com/news/articles/2024-09-05/openai-hi...


That’s true — it’s hard to measure hypotheticals.

The best solution I have found for the problem of rewarding prevention is having leaders repeatedly tell their people that prevention is important. And when performance review season comes around, folks who do preventative work can ask for quotes from more senior people who were close to the work, to speak to its impact.

YMMV, and this may not work in every org. It did work reasonably well in a number of technical orgs I was in.


> it’s hard to measure hypotheticals

Once you phrase it that way, I start to think it might be literally impossible. How did Aslan put it? "We're never told what would have happened." It's basically the same problem as predicting the future, just starting from a time in the past, with even less grounding in reality. Impossible, right?

Measurement is important, but it isn't everything. If it's your only hammer, you're going to have... exactly the bad time our society is currently having, I guess.


This seems like a place where viewing things as a Bayesian instead of a frequentist could help. We only have one outcome that actually occurred, but that doesn't necessarily mean we can't reason about likelihoods of alternative scenarios. I'm not saying I think it's worth trying to be as granular as "reduced risk of an outage by 10% in Q3" or something like that, but it seems a bit too extreme to assume that we don't know _anything_ about what might have happened.

If we didn't have any intuition at all about potential risks, how could we be taking actions that we think reduce risk in the first place? We (hopefully!) don't try to numerically quantify how productive an engineer is with raw numbers like "number of lines of code modified" or "number of commits merged", but that doesn't mean we treat it as impossible for experienced engineers to judge the performance of their subordinates. I honestly don't see this as that much harder to measure than the type of engineering work that does already get rewarded; the reason it doesn't get measured as much is because it isn't valued as much, not the reverse.


I think that hits the nail on the head.

It's important to supplement metrics with estimates, stories about the real-world impact of the work, and quotes from others close to the work. The less measurable the work, the more you need to lean on those other tools. Assuming 75% of reactive work can be measured with any degree of certainty but only 10% of preventative work can, you'll reach for those other tools more often for the latter.


Yeah, obviously you have to try to predict the effects of your disaster prevention measures, and if you can use quantitative methods so much the better, but that's not "measurement". Not even close, and especially not in the sense that the people who want to count lines of code yearn for.

And yes, those people are still out there. Some of them have learned their lesson about LoC in particular, but they haven't lost their craving for simple (often simplistic) metrics.

The difference with "regular" engineering is that working features are in fact quite tangible, even if their exact costs aren't. You put a ticket on the board for a button, or a bridge, and eventually you can push the button or drive on the bridge. Not so for prevented disasters. By the time a disaster becomes tangible, it's usually too late.


Yep, I don't pretend to believe that "software engineering" is actually "engineering" in the traditional sense. I still think that security concerns in software engineering lie somewhere on the spectrum between "real engineering" and "impossible to meaningfully reason about".

As for the people who want a metric as simple as lines of code, I can't imagine we'll ever find something that satisfies them, so I'm not particularly bothered that they won't be happy with the measures I think would be effective. To me, this is a classic case of not letting the perfect be the enemy of the good, and I think it would be unfortunate for us as an industry if we only considered ways of measuring risk prevention that are 100% quantitative.


> As for the people who want to find a metric as simple as lines of code, I can't imagine we ever find something that will satisfy them

The problem with these people is that they're "satisfied" with the illusion of certainty, until things blow up. We agree on the fundamentals re measurement, but you have to watch out for that effect. Every metric is a chance for someone to implicitly decide that it's the ultimate truth of reality.


The site seems to have been hugged to death. Does anyone have a mirror link?


In case anyone else needs it: http://archive.today/BxSoe

