> ask it explicitly to separate the implementation plan in phases
This has made a big difference on my side. I keep a prompt.md that is mostly natural-language markdown, then ask the LLM to turn that into a plan.md organised into phases, emphasising that each phase should be fairly self-contained. The plan usually needs some editing but is mostly fine, and then I just have it implement each phase one by one.
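The "implement each phase one by one" step can even be driven by a few lines of scripting. This is only a rough sketch: run_agent is a hypothetical stand-in for whatever agent or CLI you actually use, and the "## Phase" heading convention is just an assumption about how plan.md might be laid out.

```python
from pathlib import Path

def run_agent(instructions: str) -> None:
    """Hypothetical stand-in for whatever agent/CLI actually does the work."""
    print(f"--- would hand the agent {len(instructions)} chars of instructions ---")

plan = Path("plan.md").read_text()

# Assumes each phase in plan.md starts with a "## Phase" heading.
phases = ["## Phase" + chunk for chunk in plan.split("## Phase")[1:]]

for phase in phases:
    run_agent(f"Implement only this phase, then stop:\n\n{phase}")
```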
Not sure; after reading so many times that Cursor was cooked, I got a license from my company and I'm loving it. I had tried Claude Code before, though only briefly and for small things, and I don't really see much difference between one and the other. Cursor (Opus 4.5) has been able to perform complex changes across multiple files, implement whole new features, fix issues in code and project setup... I mean, it just feels like pair programming, and I never got the feeling of running into hard limits. Am I missing much, or has Cursor simply improved recently (or does it depend on the license)?
This is a fairly well-written article which captures the current state of the art correctly.
And then it goes on to recommend AI Studio as a primary dev tool?! Baffling.
There is a rationale:
> Second, and no less important, AI Studio is genuinely the best chat interface on the market. It was the first platform where you could edit any message in the conversation, not just the last one, and I think it's still the only platform where you can edit AI responses as well! So if the model goes on an unnecessary tangent, you can just remove it from the context. It's still the only platform where if you have a long conversation like R(equest)1, O(utput)1, R2, O2, R3, O3, R4, O4, R5, O5, you can click regenerate on R3 and it will only regenerate O3, keeping R4 and all subsequent messages intact.
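A toy sketch of the editing model being described, with made-up names: the conversation is just a list of turns, and regenerating O3 replaces a single assistant entry in place while every later turn stays untouched.

```python
# Toy conversation history: alternating request/output turns.
history = [
    ("user", "R1"), ("assistant", "O1"),
    ("user", "R2"), ("assistant", "O2"),
    ("user", "R3"), ("assistant", "O3"),
    ("user", "R4"), ("assistant", "O4"),
    ("user", "R5"), ("assistant", "O5"),
]

def regenerate(turns, index, new_text):
    """Replace one assistant turn in place; later turns are not regenerated."""
    role, _ = turns[index]
    assert role == "assistant"
    turns[index] = (role, new_text)

regenerate(history, 5, "O3 (regenerated)")  # index 5 holds O3 in this layout
# R4, O4, R5 and O5 are still exactly as they were.
```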
> you can click regenerate on R3 and it will only regenerate O3, keeping R4 and all subsequent messages intact.
What's a use case for this? I'm trying to imagine why you'd want that, but I can't see it. Is it for the horny people? If you're trying to do anything useful, editing a message should regenerate the following conversation as well (tool calls, etc.).
I'm sceptical of these Google-made AI builders. I just had a bad experience with Firebase Studio, which was stuck on a vulnerable version of Next.js, and Gemini couldn't update it to a non-vulnerable version properly. It tries to force vendor lock-in from the start. Guh.. avoid.
It's advertising for AI Studio, masquerading as an insightful article.
"You should remain in charge, and best way to do that is to either not use agentic workflows at all (just talk to Gemini 2.5/3 Pro in AI Studio) or use OpenCode, which is like Claude Code, but it shows you all the code changes in git diff format, and I honestly can't understand how anyone would settle for anything else."
I 100% agree with the author here. Most of the "LLMs are slowing me down/are trash/etc" discussions I've had at work usually come from people who are not great developers to begin with - they end up tangled into a net of barely vetted code that was generated for them.
> Most of the "LLMs are slowing me down/are trash/etc" discussions I've had at work usually come from people who are not great developers to begin with
This seems to be something both sides of the debate agree on: Their opponents are wrong because they are subpar developers.
It seems uncharitable to me in both cases, and of course it is a textbook example of an ad hominem fallacy.
> Most of the "LLMs are slowing me down/are trash/etc" discussions I've had at work usually come from people who are not great developers to begin with - they end up tangled into a net of barely vetted code that was generated for them.
This might be your anecdotal experience, but in mine, reviewing large diffs of (unvetted, agent-written) code is usually not much faster than writing it yourself (especially when you have some mileage in the codebase), nor does it offset the mental burden of thinking about how things interconnect and what the side effects might be.
What IMO moves the needle towards "slower" is that you have to steer the robot (often back and forth to keep it from undoing its own previous changes). You can say it's bad prompting, but there's no guarantee that a certain prompt will yield the desired results.
I think it's actually a combination of people who have seen bad results from AI code generation (and have not looked deeper or figured out how to wield it properly yet) and another segment of the developer population who now feel threatened because it's doing stuff they can't do. Different groups.
I use Claude Code within PyCharm and I see the git diff format for changes there.
EDIT: It shows the side-by-side view by default, but it is easy to toggle to a unified view. There's probably a way to permanently set this somewhere.
> which is like Claude Code, but it shows you all the code changes in git diff format
Claude Code does this; you just have to not click “Yes and accept all changes”.
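For reference, the "git diff format" both comments refer to is just a unified diff; here is a minimal sketch using Python's standard difflib (the file contents are made up):

```python
import difflib

old = [
    "def greet(name):\n",
    "    print('Hello ' + name)\n",
]
new = [
    "def greet(name: str) -> None:\n",
    '    print(f"Hello, {name}!")\n',
]

# Emit the familiar ---/+++/@@ unified-diff view of the change.
for line in difflib.unified_diff(old, new, fromfile="a/greet.py", tofile="b/greet.py"):
    print(line, end="")
```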
This is a part of why I (sometimes, depending) still use Aider. It’s a more manual AI coding process.
I also like how it uses git, and it’s good at using less context (tool calling eats context like crazy!)
I too have observed that Aider seems to use significantly less context than Claude Code, though I have found myself drifting away from it more and more in favor of Claude Code as skills and such have been added. I may have to revisit it soon. What are you using now instead (since you said "sometimes, depending")?
Absolutely - one of my favorite uses of Aider is telling it to edit config files/small utility scripts for me. It has prompted me to write more comments and more descriptive variable names to make the process smoother, which is a win by itself. I just wish it could analyze project structure as well as Claude Code... if I end up with less work at work I might try to poke at that part of the code.
It seems like I just read an advertisement. Or a submarine article, as PG would say.
AI Studio is just another IDE like Cursor, so it's a very odd choice to say one is bad and the other is the holy grail :)
But I guess this is what guerrilla advertising is these days. Just another random account with 8 karma points that just happens to post an article about how one IDE is bad and its almost-identical cousin is the best.
Gemini's large context window is incredible. I concatenate my entire repo and the repos of supporting libraries and then ask it questions.
My last use case was like this: I had an old codebase that used Backbone.js with jQuery for the UI, plus a bunch of old JS with little documentation, to generate the UI for a Clojure web application.
Gemini was able to unravel this hairball of code and guide me step by step to htmx. I am not using AI Studio; I am using a Gemini subscription.
Since I manually patch the code, it's like pair programming with an incredibly patient and smart programmer.
For the record, I am too old for vibe coding... I like to maintain total control over my code and all the abstractions and logic.
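The concatenate-everything step described above needs no special tooling. A rough sketch, where the repo paths and file extensions are placeholders to adapt to your own setup:

```python
from pathlib import Path

# Repos to fold into one prompt; paths and extensions are placeholders.
REPOS = [Path("my-app"), Path("vendor/supporting-lib")]
EXTENSIONS = {".py", ".js", ".clj", ".md"}

chunks = []
for repo in REPOS:
    for path in sorted(repo.rglob("*")):
        if path.is_file() and path.suffix in EXTENSIONS:
            # Label each file so the model can say where an answer came from.
            chunks.append(f"===== {path} =====\n{path.read_text(errors='ignore')}")

context = "\n\n".join(chunks)
print(f"{len(context):,} characters ready to paste into the model")
```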
This article makes a lot of definitive claims about the capabilities of different models that don't align with my experience with them. It's hard to take any claim seriously without completely understanding the state of the context when the behavior was observed. I don't think it's useful to extrapolate a single observation into generalized knowledge about a particular model.
Can't wait until we have useful heuristics for comparing LLMs. This is a problem that comes up constantly (especially in HN comments...)
> The context is king
Agree
> and AI Studio is the only serious product for human-in-the-loop SWE
Disagree. I use Claude Code and Codex daily, and I couldn't be happier. I started with Cursor, switched to CLI-based agents, and never looked back. I use WezTerm, tmux, neovim, and Zoxide, create several tabs and panes, and run Claude Code not only for vibe coding but also for scripting, analysing files, and letting it write concepts, texts, and documentation. It's a totally different kind of computing experience. As if I have several assistants 24/7 at my fingertips.