I often hear that AI coding feels inconsistent or generates inadequate results. I am somewhat surprised by this because more often than not, I get pretty good results.
When dealing with any AI agent (or any LLM tool for that matter), there are really just three things that ultimately impact your results:
- the context you provide
- the prompt you write
- whether you execute in chunks
This might sound almost too obvious, but being deliberate about these three factors (every time you send a request to Claude Code, ChatGPT, etc.) makes a noticeable difference in the results you see.
…and it’s straightforward to get this 80% right.
Context
LLMs are world-knowledge pocket machines. Every time you want to work on a task, you need to trim that world-knowledge pocket machine down to a surgical one, focused only on the task at hand. You do this by seeding context.
The simplest ways to do this, especially for AI coding:
- System rules & agent instructions: This is basically your `AGENTS.md` file, where you briefly explain what the project is, the architecture, the conventions used in the repository, and how to navigate the project¹ (see the sketch after this list).
- Tooling: A lot of folks miss this, but in your `AGENTS.md`, explicitly point to the commands you use yourself to build, test, and verify. I'm a big fan of maintaining a single `Makefile` with the most important commands, which the assistant can invoke easily from the command line.
- Real-time data (MCP): When you need real-time data or to connect to external tools, use MCPs. People love to go on about complex MCP setups, but don't over-index on this. For example, instead of a GitHub MCP, just install the `gh` CLI and let the agent run those commands directly. You can burn tokens if you're not careful with MCPs. But of course, for things like Figma/JIRA where there's no other obvious connection path, use them liberally.
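To make this concrete, here's a minimal sketch of what such an `AGENTS.md` might contain. The project details and make targets below are placeholders, not a prescription:

```markdown
# AGENTS.md (illustrative sketch)

## What this project is
Android app; single-activity, multi-module Gradle build.

## Architecture & conventions
- MVVM with Kotlin coroutines/Flow; no RxJava in new code
- Feature code lives under `feature/<name>/`, shared code under `core/`

## Build, test, verify
- `make build`          - assemble the debug build
- `make test <Class>`   - run unit tests for a single class
- `make lint`           - full lint pass (slow; run before committing)
```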
There are many other ways, and engineering more elegant ways to provide this context is becoming the next frontier of AI development, btw².
Prompt
Think of your prompts as specs, not search queries. For example, not this: “Write me a unit test for this authentication class” 🙅‍♂️.
Instead of that one-liner, here’s how I would start that same prompt:
Persona:
- You're an expert Android developer well versed with the industry norms and conventions for testing
- You only use Kotlin and JUnit 5, Mockito
Task:
Write the unit tests for @AuthService.kt
Context/Constraints:
- Follow the existing testing patterns as demonstrated in @RuleEngineService.kt
- Start by writing the tests for the three public methods first - `login`, `logout`, `refreshToken`
- Prefer Fakes over Mocks; if we don't have a convenient Fake class, let's add one there
- Remember, never make real network/database calls with these tests
- Make sure to cover happy paths and error cases as well
Output:
- AuthServiceTest.kt in folder <src/test/...>
- Test names: methodName_condition_expectedResult
Verify:
- Use command `make test AuthService` to keep testing just this class
- Do not run lint checks while iterating as it will take a long time
- I need to hit a code coverage of at least 80%.
- You can check coverage for this class with `make test-coverage AuthService`
First propose the plan before you start making changes or coding. Only after I accept, proceed
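For reference, this is roughly the shape of test that spec is asking for. It's only an illustrative sketch: `AuthApi`, `FakeAuthApi`, `Credentials`, `Session`, and `AuthException` are hypothetical stand-ins for whatever your project actually has, and only the `login` path is shown.

```kotlin
import org.junit.jupiter.api.Assertions.assertThrows
import org.junit.jupiter.api.Assertions.assertTrue
import org.junit.jupiter.api.Test

// Hypothetical stand-ins so the sketch is self-contained; in the real repo
// AuthService and friends would already exist.
class AuthException(message: String) : RuntimeException(message)
data class Credentials(val user: String, val password: String)
data class Session(val isAuthenticated: Boolean)

interface AuthApi {
    fun authenticate(credentials: Credentials): Session
}

// A hand-rolled Fake instead of a Mockito mock: deterministic, and no real
// network/database calls, as the prompt's constraints require.
class FakeAuthApi : AuthApi {
    var failNextLogin = false
    override fun authenticate(credentials: Credentials): Session {
        if (failNextLogin) throw AuthException("invalid credentials")
        return Session(isAuthenticated = true)
    }
}

class AuthService(private val api: AuthApi) {
    fun login(credentials: Credentials): Session = api.authenticate(credentials)
}

class AuthServiceTest {
    private val fakeApi = FakeAuthApi()
    private val service = AuthService(fakeApi)

    // Naming convention from the prompt: methodName_condition_expectedResult
    @Test
    fun login_validCredentials_returnsAuthenticatedSession() {
        assertTrue(service.login(Credentials("user", "pw")).isAuthenticated)
    }

    @Test
    fun login_invalidCredentials_throwsAuthException() {
        fakeApi.failNextLogin = true
        assertThrows(AuthException::class.java) {
            service.login(Credentials("user", "wrong"))
        }
    }
}
```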
I have a text expansion snippet `aiprompt;` that I start with almost every single time. This reminds me to structure and start any prompt:
Persona:
- {cursor}
Task:
-
Context/Details/Constraints:
-
Output:
-
Verify:
-
This structure forces you to think through the problem and gives the AI what it needs to make good decisions.
Custom commands
But it can get tedious to write these prompts in detail every single time. So you might want to create “command” templates. These are just markdown files that capture your detailed prompts.
This is one of those things that people don't leverage enough. Especially if your team has a shared folder of commands that everyone is iterating on, you can end up with a powerful set of prompts that you can quickly use to get really good results. I have commands like `/write-unit-test.md`, `/write-pr-desc.md`, `/debug-ticket.md`, `/understand-feature.md`, etc.
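For illustration, here's roughly what a command file like `/write-unit-test.md` might contain; the specifics below are placeholders for whatever conventions your team actually settles on:

```markdown
# write-unit-test

Persona:
- You're an expert Android developer, well versed in our testing conventions

Task:
- Write unit tests for the class I point you to

Context/Constraints:
- Follow the existing testing patterns in this repository
- Prefer Fakes over Mocks; never make real network/database calls

Output:
- A `<ClassName>Test.kt` file alongside the existing tests
- Test names: methodName_condition_expectedResult

Verify:
- Run `make test <ClassName>` and report coverage for the class
```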
Chunking
AI agents hit limits: context windows fill up, attention drifts, agents start hallucinating, you get poor results. Newer models can run hours‑long coding sessions, but until that is common, the simpler fix is to break work into discrete chunks and plan before coding.
Most developers I talk to seem to miss this. I can't stress enough how important this is, especially when you're working on slightly longer tasks. My earlier post goes into this in detail; it was the single biggest step-function improvement in my own AI coding practice.
Briefly, this is how I go about it:
Session 1 — Plan only
- Share the high‑level goal and keep going back and forth with the agent
- Don't write any code in this session; treat it purely as a session to tell the agent what it's about to do.
- Once you're convinced, ask the agent to write the plan out as detailed markdown in your `.ai/plans/` folder.
- Reset context before you start executing.
Session 2+ — Execute one task at a time
- Spawn a fresh agent and load `task-1.md` (see the sketch after this list).
- Implement only that task, verify & commit.
- Reset or clear your session.
- Proceed to `task-2.md` and repeat.
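For concreteness, here's roughly what one of those task files in `.ai/plans/` might look like, reusing the earlier `AuthService` example; the steps are placeholders:

```markdown
# task-1.md (illustrative sketch)

Goal: Add token refresh to AuthService

Steps:
1. Add `refreshToken()` to AuthService, mirroring the existing `login()` flow
2. Extend FakeAuthApi so the new path is testable without real network calls
3. Write unit tests covering the happy path and expiry/error cases

Done when:
- `make test AuthService` passes and coverage stays at or above 80%
- No changes outside the auth module
```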
One-shot requests force the agent to plan and execute simultaneously, which doesn't produce great results. If you were sending this work to your colleagues as PRs to review, how would you break it up? You wouldn't write a single 10,000-line PR, so don't do that with your agents either.
Plan → chunk → execute → verify.
So the next time you’re not getting good results, ask yourself these three things:
- Am I providing all the necessary context?
- Is my prompt a clear spec?
- Am I executing in small, verifiable chunks?
Discuss this on X