I often hear AI coding feels inconsistent or underwhelming. I’m surprised by this because more often than not, I get good results.
When working with any AI agent (or any LLM tool), there are really just three things that drive your results:
- the context you provide
- the prompt you write
- executing in chunks
This may sound obvious, but being deliberate about these three (every time you send a request to Claude Code, ChatGPT, etc.) makes a noticeable difference.
…and it’s straightforward to get 80% of this right.
Context #
LLMs are pocket‑sized world knowledge machines. Every time you work on a task, you need to trim that machine down to a surgical tool focused only on the task at hand. You do this by seeding context.
The simplest way to do this, especially for AI Coding:
- System rules & agent instructions: This is basically your
AGENTS.mdfile where you briefly explain what the project is, the architecture, conventions used in the repository, and navigation the project1. - Tooling: Lot of folks miss this, but in your AGENTS.md, explicitly point to the commands you use yourself to build, test and verify. I’m a big fan of maintaining a single
Makefilewith the most important commands, that the assistant can invoke easily from the command line. - Real‑time data (MCP): when you need real-time data or connect to external tools, use MCPs. People love to go on about complex MCP setup but don’t over index on this. For e.g. instead of a github MCP just install the
ghcli command let the agent run these directly. You can burn tokens if you’re not careful with MCPs. But of course, for things like Figma/JIRA where there’s no other obvious connection path, use it liberally.
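To make this concrete, here's a minimal sketch of what such an AGENTS.md might look like. Everything in it (project name, folders, commands) is a hypothetical placeholder; the point is the shape, not the specifics.

```markdown
# AGENTS.md

## Project
Hypothetical single-module Android app written in Kotlin (MVVM).

## Architecture & conventions
- UI in `ui/`, business logic in `domain/`, data access in `data/`
- Coroutines + Flow; no RxJava
- Tests live next to the code they cover: JUnit 5, fakes over mocks

## Tooling (use these; don't invent your own commands)
- `make build`: compile the app
- `make test <Class>`: run unit tests for one class
- `make test-coverage <Class>`: coverage report for one class
- `make lint`: slow; run only when asked

## Navigating the project
Feature entry points are listed in `docs/features.md`; start there.
```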
There are many other ways, and engineering better context delivery is fast becoming the next frontier in AI development2.
Prompt #
Think of prompts as specs, not search queries. For example: ‘Write me a unit test for this authentication class’ 🙅‍♂️.
Instead of that one‑liner, here’s how I would start that same prompt:
Persona:
- You're an expert Android developer, well‑versed in industry norms and testing conventions
- You only use Kotlin, JUnit 5, and Mockito
Task:
Write the unit tests for @AuthService.kt
Context/Constraints:
- Follow the existing testing patterns as demonstrated in @RuleEngineService.kt
- Write tests for the three public methods first - `login`, `logout`, `refreshToken`
- Prefer fakes over mocks; if we don't have a convenient fake class, add one
- Remember, never make real network or database calls in these tests
- Make sure to cover happy paths and error cases as well
Output:
- AuthServiceTest.kt in folder <src/test/...>
- Test names: methodName_condition_expectedResult
Verify:
- Use `make test AuthService` to test just this class
- Don't run lint while iterating; it will take a long time
- I need to hit a code coverage of at least 80%.
- You can check coverage for this class with `make test-coverage AuthService`
First, propose a plan before you start making changes or coding. Proceed only after I accept.
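For a sense of what this spec should buy you, here's the shape of test the agent might come back with. The fake and the trimmed‑down service below are hypothetical stand‑ins so the snippet is self‑contained; the real AuthService.kt would obviously be richer.

```kotlin
import org.junit.jupiter.api.Assertions.assertEquals
import org.junit.jupiter.api.Assertions.assertThrows
import org.junit.jupiter.api.BeforeEach
import org.junit.jupiter.api.Test

// Hypothetical fake standing in for a real, network-backed auth repository.
class FakeAuthRepository {
    var shouldFail = false

    fun authenticate(user: String, password: String): String {
        require(password.isNotBlank()) { "password required" }
        if (shouldFail) throw IllegalStateException("auth backend unavailable")
        return "token-for-$user"
    }
}

// Trimmed-down stand-in for the real AuthService so this snippet compiles on its own.
class AuthService(private val repository: FakeAuthRepository) {
    fun login(user: String, password: String): String =
        repository.authenticate(user, password)
}

class AuthServiceTest {

    private lateinit var repository: FakeAuthRepository
    private lateinit var service: AuthService

    @BeforeEach
    fun setUp() {
        repository = FakeAuthRepository()
        service = AuthService(repository)
    }

    @Test
    fun login_validCredentials_returnsToken() {
        // Happy path: a valid login returns the token the repository issued.
        assertEquals("token-for-alice", service.login("alice", "secret"))
    }

    @Test
    fun login_backendUnavailable_throwsIllegalState() {
        // Error case: the fake simulates an outage, no real network involved.
        repository.shouldFail = true
        assertThrows(IllegalStateException::class.java) {
            service.login("alice", "secret")
        }
    }
}
```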
I use a text-expansion snippet, `aiprompt;`, almost every single time. It reminds me to structure any prompt:
Persona:
- {cursor}
Task:
-
Context/Details/Constraints:
-
Output:
-
Verify:
-
This structure forces you to think through the problem and gives the AI what it needs to make good decisions.
Custom commands #
Writing detailed prompts every single time gets tedious. So you might want to create “command” templates. These are just markdown files that capture your detailed prompts.
People don’t leverage this enough. If your team maintains a shared folder of commands that everyone iterates on, you end up with a powerful set of prompts you can quickly reuse for strong results. I have commands like /write-unit-test.md, /write-pr-desc.md, /debug-ticket.md, /understand-feature.md etc.
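As an illustration, here's roughly what a shared `/write-unit-test` command could contain. The `.claude/commands/` location and the `$ARGUMENTS` placeholder are Claude Code conventions (other tools have similar mechanisms), and the body is just a condensed, hypothetical version of the earlier prompt:

```markdown
<!-- .claude/commands/write-unit-test.md -->
Persona: You're an expert Android developer using Kotlin, JUnit 5, and our in-house fakes.

Task: Write unit tests for $ARGUMENTS.

Context/Constraints:
- Follow the existing testing patterns in the neighbouring test classes
- Prefer fakes over mocks; add a fake if a convenient one doesn't exist
- Never make real network or database calls
- Cover happy paths and error cases

Output: <ClassName>Test.kt next to the existing tests, test names as methodName_condition_expectedResult.

Verify: `make test <ClassName>`; aim for at least 80% coverage via `make test-coverage <ClassName>`.
```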
Chunking #
AI agents hit limits: context windows fill up, attention drifts, hallucinations creep in, results suffer. Newer models can run hours‑long coding sessions, but until that’s common, the simpler fix is to break work into discrete chunks and plan before coding.
Many developers miss this. I can’t stress how important it is, especially when you’re working on longer tasks. My post covers this; it was the single biggest step‑function improvement in my own AI coding practice.
Briefly, here’s how I go about it:
Session 1 — Plan only #
- Share the high‑level goal and iterate with the agent
- Don’t write code in this session; use it to tell the agent what it’s about to do.
- Once you’re convinced, ask the agent to write the plan as detailed markdown in your `.ai/plans/` folder
- Reset context before you start executing
Session 2+ — Execute one task at a time #
- Spawn a fresh agent, load `task-1.md` (a sample task file follows this list).
- Implement only that task, verify & commit.
- Reset or clear your session.
- Proceed to `task-2.md` and repeat.
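For reference, each task file can stay small. A hypothetical `.ai/plans/task-1.md`, reusing the earlier AuthService example, might look like:

```markdown
# Task 1: AuthService unit tests (hypothetical example)

Goal: cover `login`, `logout`, `refreshToken` with unit tests, fakes only.

Steps:
1. Add a fake auth repository if one doesn't exist yet
2. Write happy-path tests for the three public methods
3. Add error-case tests (backend failure, expired token)

Verify: `make test AuthService` passes; `make test-coverage AuthService` reports at least 80%.
Out of scope: anything touching the UI (that's task 2).
```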
One‑shot requests force the agent to plan and execute simultaneously — which rarely produces great results. If you were to submit these as PRs to your colleagues for review, how would you break them up? You wouldn’t ship 10,000 lines, so don’t do that with your agents either.
Plan → chunk → execute → verify.
So the next time you’re not getting good results, ask yourself these three things:
- Am I providing all the necessary context?
- Is my prompt a clear spec?
- Am I executing in small, verifiable chunks?
Discuss this on X + LinkedIn