2025 is the year of agents. I’ve heard about SDR AI agents but not great things. Most “agents” sound like workflow automations that have been around forever. Anyone have an example of an “ai” agent which I understand to be intelligent that isn’t a glorified or rebranded workflow automation? Thx.
Been in BigCo land for 20 years now, and have seen the rise and fall of quite a few AI/ML/RPA etc fads.
Honestly the whole landscape seems broken and unproductive at this point.
Countless vendors, platforms, cloud environments, industry/technical jargon - all with different pricing models, SLAs, tooling, etc etc.
Getting anything usable is a challenge and most orgs spin in a never ending cycle of data integration/normalization work that produces little business value.
My advice to teams now is simplify, reduce, streamline - get to the kernel of what you think you need and protect it at all costs. Most of the shiny new objects being pitched as silver bullets are just ways for other people to make money off your margin.
On the one hand you have gurus claiming that AI agents are going to make all SaaS redundant, on the other you have gurus claiming that AI isn't going to take my coding job, but I need to adapt my workflows to incorporate AI. We all need to start preparing now for the changes that AI is going to cause.
But these two claims aren't compatible. If AGI and these super agents are that bonkers amazeballs that they can replace entire SaaS companies - then there is no way I'm going to be able to adapt my workflows to compete as a programmer.
Further, if the wildest claims about AI end up proving to be true - there is simply no way to prepare. What possible adaptation to my workflow could I possibly come up with that an AI agent could not surpass? Why should I bother learning how to implement (with today's apis) some RAG setup for a SaaS customer service chatbot when presumably an AI agent is going to make that skillset redundant shortly after?
I'm going to be interviewing for frontend roles soon, and for my prep I'm just going back to basics and making sure I remember on demand all the basics css, html, js/ts - fuck the rest of this noise.
Programmers don't work in isolation. So I don't know how necessary it would be to quickly adapt your workflows to compete. If there's something that's useful to adopt, there will be a stream of blog posts, coworkers, people at user groups and what not spoon feeding what they learned to others. I don't think there's much cause for FOMO, I don't think it makes a big difference whether you start using a faster way to work a few months earlier or later than others. It can be cheaper to not jump on any hype train and potentially miss out on genuine improvements for a while, than to jump on all the hype trains and waste a lot of time on stuff that goes nowhere.
And like you said, if the wildest claims hold true, all programmers are out of a job by the end of 2026 anyway, with all other jobs following over the course of a few years. There's too many variables to predict what would happen in such a scenario, so probably best to deal with it if it happens.
So to me, your strategy checks out. I've personally invested some time into code generating and agentic tooling, but ultimately went back to Claude-as-Google-replacement. By my estimation, about a 5-10 % productivity boost compared to my workflow in 2022. The work is about the same, I just learn a bit faster.
> And like you said, if the wildest claims hold true, all programmers are out of a job by the end of 2026 anyway, with all other jobs following over the course of a few years. There's too many variables to predict what would happen in such a scenario, so probably best to deal with it if it happens.
So much this. AGI is the equivalent of a nuclear apocalypse in many ways—it's unlikely, not unlikely enough for comfort, but also totally not worth preparing for because there's basically no way to predict what preparations would actually be helpful, nor is it obvious that you'd even want to survive it if it happened.
The expected value of prepping for it isn't worth the investment, so it's better to do what most of us already do for nuclear war and pretty much pretend it won't happen.
I need an AI agent to continuously ask questions of PMs or stakeholders until the requirements are less vague. The good thing is this would be a plain english discussion which LLMs are good at. A PM can ask if something is technically feasible to some degree too. Maybe it can even break up tickets in a much better fashion too.
I’m a PM. Today I built a working mockup with Windsurf (golang + wails + vuejs + duckdb). Windsurf is made by Codeium and branded as the first agentic IDE.
Your requirements will improve; not sure if in the long run I still need developers to build the actual software.
The development process with Windsurf is a bit like rolling a die and hoping for a 6. A lot of trial and error, but if you check the git log, you see about 15 minutes between commits per feature request. Windsurf does a good job of summarizing the entire feature request chat into a short git commit message. Every git commit reads like a user story.
How… do I find PMs like you? Literally have never worked with a single one that bothered to understand the technology they are building on top of at a deep enough level.
Maybe I just need to teach the ones I work with that it is now possible to trivially prototype many ideas without much or any coding skill.
Most PMs resist this because they know that responsibility for understanding the requirements then falls on them. Traditionally that has been shared with architects, analysts, developers, and other stakeholders, and if you replace those people with an LLM, well, it can't be a true stakeholder in the same way.
There’s just words on the webpage of genatron. Not a single screenshot or video, no example output, no customer statements. Even the technical details are very thin. Doesn’t give me a good impression of what they’re trying to sell.
As a PM, ChatGPT is great at helping me write tickets in a structured format from me just giving it a single sloppy sentence. I of course review it to make sure it’s understanding me properly. But having to explicitly write stuff like intended behaviors when submitting bugs can be really laborious, though I understand why engineers sometimes need that level of clarity (having been one myself for 15 years)
I have not seen one in production, but I did see 'agent products' sold to financial companies for compliance purposes ( sanctions, mortgage, other regs ). Fascinating stuff that got me mildly interested in MS troupe.
Not by name (edit: and corporate product names seem to change a lot from where I sit), but every bigger consulting company/vendor[2] that works with banks/brokers/financial institutions right now seems to have at least some offering in that space to ride the AI wave. The presentation I saw specifically was from Crowe[1].
I like this distinction from automation by Bartosz Pucek:
At its core, an Agent is software that can:
- Take in a task description
- Break it down into steps
- Execute those steps using available tools
- Adapt its approach based on feedback
The key distinction from traditional automation: Agents handle variance and uncertainty by replanning rather than failing when their happy path breaks.
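To make the "replanning rather than failing" point concrete, here is a minimal Python sketch of that loop. Everything in it is an assumption for illustration: `run_agent`, `toy_planner`, and the two fake tools are hypothetical names, and the planner is a hard-coded stub standing in for what would be an LLM call in a real system.

```python
from typing import Callable, Dict, List, Optional, Tuple

Step = Tuple[str, str]  # (tool name, argument)
Planner = Callable[[str, Optional[str]], List[Step]]


def run_agent(task: str, planner: Planner,
              tools: Dict[str, Callable[[str], str]],
              max_replans: int = 2) -> List[str]:
    """Plan, execute, and replan with feedback when a step fails."""
    feedback: Optional[str] = None
    for _ in range(max_replans + 1):
        steps = planner(task, feedback)  # break the task into steps
        results: List[str] = []
        try:
            for tool_name, arg in steps:  # execute steps with the available tools
                results.append(tools[tool_name](arg))
        except Exception as err:
            # The happy path broke: feed the error back and ask for a new plan.
            feedback = f"{tool_name} failed: {err!r}"
            continue
        return results
    raise RuntimeError(f"gave up after {max_replans} replans: {feedback}")


# Hypothetical tools: the primary one always fails, the backup works.
def flaky_api(arg: str) -> str:
    raise TimeoutError("primary API timed out")


def backup_api(arg: str) -> str:
    return f"booked {arg} via backup"


# Stub planner (a real agent would call an LLM here).
def toy_planner(task: str, feedback: Optional[str]) -> List[Step]:
    if feedback is None:
        return [("primary", task)]
    return [("backup", task)]


result = run_agent("flight to Berlin", toy_planner,
                   {"primary": flaky_api, "backup": backup_api})
```

A traditional automation would stop at the `TimeoutError`; the only difference here is that the failure becomes input to the next plan.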
We are working on a project called Potpie (https://github.com/potpie-ai/potpie). It's basically an open-source agentic AI platform that helps developers build AI agents that truly understand complex codebases and perform desired actions. Unlike the majority of AI agent platforms and GenAI models, Potpie's AI agents can understand the overall context of your codebase thoroughly.
Amazing that, nearing the 50 comment mark, there only seem to be people who have successfully created tutorial examples, plus some other things that could be done with more purpose-specific traditional solutions, and some people showing love for the concepts. This is probably the bleakest I've seen an Ask HN thread, considering this is where all the money is going. I think one stark thing that maybe isn't being addressed is that the value of the models is being completely controlled by the model creators, or else there would be at least one story by now of success that doesn't involve merely making the LLM products available to customers as a middle entity.
This is what OP is explicitly not asking for—it's just a demo of a theoretical case, Temporal showing how a company that's hyped up on AI agents could use your platform to do agent-y things.
OP wants to know if anyone is actually using this stuff productively, not if anyone has tech demos. We've all seen more than enough tech demos.
This is interesting stuff, and a great stepping stone. I think the excitement around true agents will come when the AI can author the workflow pipeline, so to speak, in response to a request.
This is an area where terminology is in flux but I think of weak agents as mostly-hardcoded, eg if you wrote a flight booking bot that can converse with you about flight options then go do the booking - but you specified the APIs and workflow engine. Strong agents can self-directedly follow long range goals over long time frames, eg “run this business unit for me” or “manage my portfolio”.
Workflows = automations that use LLMs (or a sequence of LLM calls) at some point. Eg. "Classify this input and respond with JSON"
Is Siri or Google Assistant an "agent"? I would say no it's basically an LLM with function calling. Eg. "What's the weather -> uses predefined weather api"
Agents would need to be able to self integrate, which is impossible without giving them full computer use or admin permissions - which creates massive security risks nobody seems to be talking about.
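For anyone unfamiliar with the "LLM with function calling" pattern mentioned above, a rough Python sketch follows. The model is faked: `fake_model` is a hypothetical stub that returns the kind of JSON function call a real model might emit, and `get_weather` is a made-up stand-in for a predefined weather API.

```python
import json
from typing import Callable, Dict


def get_weather(city: str) -> str:
    # Hypothetical stand-in for a predefined weather API.
    return f"Sunny in {city}"


# The developer decides up front which functions exist; the model only picks one.
FUNCTIONS: Dict[str, Callable[..., str]] = {"get_weather": get_weather}


def fake_model(user_message: str) -> str:
    # Stub for an LLM call: returns a JSON-encoded function call.
    return json.dumps({"name": "get_weather", "arguments": {"city": "Paris"}})


def handle(user_message: str) -> str:
    call = json.loads(fake_model(user_message))
    func = FUNCTIONS.get(call["name"])
    if func is None:
        # The model cannot integrate anything that was not registered for it.
        return f"Sorry, I can't do {call['name']}."
    return func(**call["arguments"])


reply = handle("What's the weather in Paris?")
```

The registry is the whole security boundary here, which is why "self-integrating" agents are a different and riskier thing.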
The name is less important to me than whether I can have something that does my work for me. I've been starting to play with my own solutions across the 3 foundation model companies. I've started to try to build my own stuff a bit, I think I need to learn more about AppleScript, also so far my experiments have required me to have multiple systems running to make it super easy for me.
You're all going to laugh at this stuff because it's so remedial and also clearly not agents, but here are a couple of things I've done... I won't say I really USE this stuff daily, I just play to see what I can do. I've figured out how to pass screenshots back and forth between models (I have one computer take a screenshot every 30 minutes and send it to another machine; that machine is set up with a mouse hovering over the upload button on Perplexity, it uploads the screenshot, and then Perplexity does the work from the screenshot). An example of this that worked OK: I had ChatGPT create all the themes for the social media schedule I needed to do this year, then I passed that screenshot to Perplexity to do the searching on the web, and then I passed that to Claude to write the tweet. This actually works ok-ish and I'm going to expand it a bit over the coming weeks I guess. Tools like this are super helpful for weird hacks like that: https://github.com/BlueM/cliclick
Another thing I've found actually works pretty well is setting up two computers next to each other with ChatGPT voice mode. If you give them custom instructions to be sure to wait for the other one to be done talking, they don't interrupt each other and can get quite a bit of work done. Here is just a video of the MVP that I sent to a friend ages ago once I started playing with the idea: https://s.h4x.club/kpuzNkNL - I actually use this method of working quite often now, a couple of times a week at least, and I find it's pretty helpful. If I knew how to put 4/5 models together in one app and give them each custom instructions, I'd love to try building a team (if someone out there actually knows how to build this kinda stuff, I'm happy to help flesh out how the product would need to work, but I don't think it's super difficult to build at this point, I'm just not technical enough).
Just an update here: I forgot I'm supposed to have childlike wonder and it's the weekend, but then I remembered... sooo... 4 hours later I now have a complete marketing department of agents, and it works pretty well actually. I gave it a high-level task around building a full campaign, and it is doing just that. Here is the social media manager agent off on its own composing the tweets. The social media manager agent is built with 4 internal agents, but calls out to my Hacker News agent and my Google search agent when needed. It actually works super well... you can see it running here (the manager even told it to do all the tweets for the year, so I presume it's going to stop at 365 tweets): https://s.h4x.club/eDubwABJ
Going to spend the rest of the day building out the full system till I have a complete complement of agents that can do every task in the startup, heh.
I'm not exactly clear what you're asking. Where do you draw the line between "workflow automation" and "doing work"? To me it just seems like a spectrum with rapidly moving goal posts.
A decade ago, enterprises had quite a lot of roles involving essentially moving data from one ERP screen to another. From what I'm seeing, these roles seem to be quickly disappearing, with a combination of proper API-based automation, GUI automation and most recently LLM "agents" in crucial steps.
And on a very different note, I as a developer could ask an AI tool such as Aider or Windsurf to perform a big refactoring or other code change, working autonomously across code changes and shell commands until it passes all tests - this is agentic behavior that I didn't have even a year ago.
- ICP / Sales Agent: I hired an offshore resource and built a GPT that they can send titles and other identifiers to, and it would say if it was in our ICP or not. I created it for a specific process that has outlined steps and FAQ from that person on things they have encountered, I plan on adding more questions and answers. This was super helpful on saving time on answering questions about titles / improving the results of their work.
- Domain Policy Scan (SPF, DKIM, DMARC): I scan domains and find SPF records and then use an Agent and a prompt to break out all the system tokens from the SPF to understand the systems companies are using. The prompt is a constant work in progress, but I have it down to be really consistent.
Both have been really helpful to my overall workflow.
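For readers wondering what "breaking out the system tokens from the SPF" can look like, here is a small Python sketch. Note that it swaps in plain deterministic parsing instead of the agent-plus-prompt approach described above, so it is not the commenter's method. The function names and the tiny `KNOWN_SENDERS` table are assumptions for illustration, and the example record is made up.

```python
from typing import Dict, List

# Illustrative, deliberately incomplete mapping of SPF include domains to vendors.
KNOWN_SENDERS: Dict[str, str] = {
    "_spf.google.com": "Google Workspace",
    "spf.protection.outlook.com": "Microsoft 365",
    "sendgrid.net": "SendGrid",
    "servers.mcsv.net": "Mailchimp",
    "amazonses.com": "Amazon SES",
}


def spf_includes(record: str) -> List[str]:
    """Return the domains referenced by include: mechanisms in an SPF record."""
    terms = record.split()
    if not terms or terms[0].lower() != "v=spf1":
        return []  # not an SPF record
    domains = []
    for term in terms[1:]:
        mechanism = term.lstrip("+-~?")  # drop the optional qualifier
        if mechanism.lower().startswith("include:"):
            domains.append(mechanism[len("include:"):].lower().rstrip("."))
    return domains


def identify_systems(record: str) -> List[str]:
    """Map each include domain to a vendor name, flagging unknown ones."""
    return [KNOWN_SENDERS.get(d, f"unknown ({d})") for d in spf_includes(record)]


# Made-up example record, not taken from a real domain.
example = "v=spf1 include:_spf.google.com include:sendgrid.net ip4:203.0.113.7 ~all"
systems = identify_systems(example)
```

The lookup table is where this approach runs out of road; the long tail of unrecognised include domains is presumably where an LLM prompt earns its keep.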
Isn't that just simple glorified workflow automation? Shouldn't "agents" do and decide what to do themselves based on the holy prophecy of VC and AI Startups ?
> Workflows are systems where LLMs and tools are orchestrated through predefined code paths.
> Agents, on the other hand, are systems where LLMs dynamically direct their own processes and tool usage, maintaining control over how they accomplish tasks.
The workflows can be easily updated to be an agent, the scan already creates sales opportunities for me and can be updated to create messages based on active email campaigns that I have created.
Just a buzzword for investors given we peaked with language models.
Chaining different prompts can be useful: calling that agents is purely marketing: these models are pretty dumb and don't have agency. I'd stay away from related frameworks
sales ops here. I was just tasked with figuring out how to use AI to use previous quotes to generate new quotes so sales people don't spend so much time creating quotes. Seems like the perfect thing for an agent. Anyone done this?
That sounds alright, but I'm having difficulty imagining a situation where a business wants to produce a quote with novel element types / parameterizations not yet seen before without a human hand in the loop.
I’m way out in assumption-land, but I’m guessing all quotes would be reviewed by humans, and the goal is to take the drudgery out of first drafts.
For that it would be fine if the AI took a stab at something novel like a faster than usual delivery timeline or higher than usual part tolerances. It might get the economics wrong, but just by including them it would be easier for a human to adjust.
In my pre-sales career, we just did copy and paste for spreadsheets and docs. Most quotes only require finding the nearest recent one and a replace-all for key bits of information followed by careful proof-reading.
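That "find the nearest recent quote, then replace-all" routine is simple enough to sketch without any AI. The Python below is a rough illustration only: `past_quotes`, `draft_quote`, and all the customer names are invented, and the standard library's `difflib` similarity is just one assumed way of picking the "nearest" quote.

```python
import difflib
from typing import Dict

# Hypothetical archive: short description -> full quote text.
past_quotes: Dict[str, str] = {
    "500 aluminium brackets, standard tolerance":
        "Quote for Acme Ltd: 500 aluminium brackets, standard tolerance, delivery in 4 weeks.",
    "200 steel housings, tight tolerance":
        "Quote for Borealis GmbH: 200 steel housings, tight tolerance, delivery in 6 weeks.",
}


def draft_quote(request: str, replacements: Dict[str, str]) -> str:
    """Copy the most similar past quote and swap in the key bits of information."""
    matches = difflib.get_close_matches(request, list(past_quotes), n=1, cutoff=0.0)
    if not matches:
        raise ValueError("no past quotes to start from")
    draft = past_quotes[matches[0]]
    for old, new in replacements.items():
        draft = draft.replace(old, new)
    return draft  # still needs careful human proofreading


draft = draft_quote(
    "750 aluminium brackets, standard tolerance",
    {"Acme Ltd": "Initech", "500": "750"},
)
```

Anything the replace step misses (a delivery date, a stale part number) survives silently into the draft, which is exactly why the proofreading step matters.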
Sounds like a poorly thought out requirement. If you are tasked with speeding up the generation of quotes and find that AI can do the job well, that is perfectly reasonable. But if you are told what tool to use to make it happen, whoever tasked you with it doesn't understand that AI is a tool, not a goal. (I say that often enough, I may need to put it on t-shirts.)
For him and his boss and the boss of his boss it may well be a goal to use more AI in business processes. It may be decided in the strategy to spend X% on AI in the next 3 years. So you will do exactly that and not question if it makes sense at all.
I disagree here. It sounds to me like the requirements are clear: Use some AI "agent" to perform this task. That means it should be trained on a particular dataset, and it should perform a particular function. This would be in place of trying to write software to directly do this, just let the AI perform task processing, proposal drafting, document formatting.
We sell maker and STEM education electronics, but the profit margins on products like Raspberry Pis, Micro:bits, and Arduinos are, well, pretty slim. This has pushed us to become extremely efficient; so much so that we ended up creating our own AI-agent-based ERP platform called Koi [1]
In essence, our work is built on the shoulders of giants like OpenAI’s Assistant API, Anthropic and Rails.
One of our standout demos is that certain objects (Orders, Quotes, Supplier Orders, Customers etc) in our database are assigned their own email addresses (using Rails' Action Mailbox[2]). Emails can be forwarded directly to these objects, whether it’s an order, a customer, or a supplier order.
From there, our agent, “Koi,” automatically extracts relevant information from emails and takes appropriate actions. For example, Koi can create a quote, attach a purchase order PDF to an order, or extract tracking information from supplier shipping confirmation emails to provide live tracking updates.
It also works the other way around; you can ask Koi to send a customer their tax invoice or inform them that a product they were interested in is out of stock, seamlessly handling typical customer service tasks.
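As a rough illustration of the per-object email address idea, here is a Python sketch of the routing step only. It is not how Koi or Action Mailbox does it: the `order-1042@...` address scheme, the `route_inbound` name, and the sample message are all assumptions made up for the example.

```python
import re
from email.message import EmailMessage
from email.utils import parseaddr
from typing import Optional, Tuple

# Assumed address scheme: <object type>-<numeric id>@<any domain>.
OBJECT_ADDRESS = re.compile(r"^(order|quote|customer)-(\d+)@", re.IGNORECASE)


def route_inbound(message: EmailMessage) -> Optional[Tuple[str, int]]:
    """Return (object type, object id) for a forwarded email, or None if unroutable."""
    _, address = parseaddr(str(message.get("To", "")))
    match = OBJECT_ADDRESS.match(address)
    if match is None:
        return None
    return match.group(1).lower(), int(match.group(2))


# Made-up forwarded shipping confirmation.
msg = EmailMessage()
msg["To"] = "Orders <order-1042@inbound.example.com>"
msg["Subject"] = "Fwd: shipping confirmation"
msg.set_content("Tracking number: ABC123")

target = route_inbound(msg)
```

Once the target object is known, the extraction step (pulling out a tracking number, attaching a PDF) is where the agent would take over.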
Previously, we integrated speech-to-text functionality using the Whisper API, which made for an impressive demo.
Now, we’re taking it a step further by rebuilding our speech system to leverage OpenAI’s new WebRTC-based Real-time API. The key advantage here is that it comes with function calling support[3]. We already support a variety of automation features using barcodes[4], allowing users to scan a barcode and have Koi perform specific actions. This has proven to be an ideal area in the application to integrate tool use with the real-time API, creating even more powerful and efficient workflows.
Our ultimate goal is to integrate this system with Bishop, our product-picking robot[5].
Your spiel here is much better than the website you've linked.
What you've linked sounds like you're selling a glorified shipping label printer.
I'm curious how this differs from standard TA/TMS systems that have been around for decades. I work in the space and there are plenty of TA/TMS systems that print shipping labels and fulfil orders, that update stock levels and send out tracking emails + SMS messages, integrate with carriers for shipment updates, that integrate with Shopify, eBay, Etsy, BigCommerce, etc.
They didn't need AI to do any of that. What's the advantage you're finding?
Here's an example that seems to operate in Australia:
Shipping is a fraction of what the system does. To completely automate shipping you need an understanding of inventory etc. To do automated customer service, you need knowledge of shipping, inventory etc.
I have mentioned this on Twitter recently. My stream there is full of people talking about agents being the future, several posts on how to make them, but almost zero examples of any that they have built or used.
Not sure why it’s a secret, it’s a pretty big limitation, basically means AI agents are just a good tool for problem domains where mistakes can be tolerated or where no better alternative exists because the problem space is too vast to create solutions that work predictably 100% of the time.
Unfortunately I'm one of those who don't have working stuff yet, but hopefully I will have something soon enough.
My thought process on agentic work is the following: treat agents as input-output operations that merge with deterministic processes.
To be more specific: from what I see in my non-tech industry, when you try to implement process management, people are both quite good and terrible at following agreed processes at the same time. They are great at detecting deviation from the process (when an exception is needed) and terrible at doing the same thing for the 1000th time in a row.
So at a high level, I think agents should handle the automation and detect when there is a deviation from the process, in which case a human should take over.
Tldr - I don't think agentic workflows without a human will be there any time soon. But we will have 2 humans + agents replacing a 10 human team.
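A minimal Python sketch of that split, automate the routine cases and hand deviations to a human, might look like the following. The invoice scenario, the supplier names, the approval limit, and the rule-based `deviates` check are all assumptions for illustration; in the setup described above, the deviation check is where an agent would sit.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Invoice:
    supplier: str
    amount: float


@dataclass
class Outcome:
    automated: List[Invoice] = field(default_factory=list)
    needs_human: List[Invoice] = field(default_factory=list)


# Hypothetical definition of the agreed process.
KNOWN_SUPPLIERS = {"Acme", "Globex"}
AUTO_APPROVE_LIMIT = 1000.0


def deviates(invoice: Invoice) -> bool:
    # Stand-in for the agent's judgement: anything outside the agreed process.
    return invoice.supplier not in KNOWN_SUPPLIERS or invoice.amount > AUTO_APPROVE_LIMIT


def process(invoices: List[Invoice]) -> Outcome:
    outcome = Outcome()
    for invoice in invoices:
        if deviates(invoice):
            outcome.needs_human.append(invoice)  # exception: a person takes over
        else:
            outcome.automated.append(invoice)    # routine: the 1000th identical case
    return outcome


outcome = process([
    Invoice("Acme", 250.0),
    Invoice("Globex", 5000.0),    # over the limit
    Invoice("Initech", 100.0),    # unknown supplier
])
```

The ratio of `automated` to `needs_human` is roughly what decides whether it really is 2 humans replacing 10.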