I often hear that AI coding feels inconsistent or underwhelming. This surprises me because, more often than not, I get good results.

When working with any AI agent (or any LLM tool), there are really just three things that drive your results:

  1. the context you provide
  2. the prompt you write
  3. the chunks you execute in

This may sound discouragingly obvious, but being deliberate about these three (every time you send a request to Claude Code, ChatGPT, etc.) makes a noticeable difference.

…and it’s straightforward to get 80% of this right.

Context #

LLMs are pocket‑sized world‑knowledge machines. Every time you work on a task, you need to narrow that general machine down to a surgical one, focused only on the task at hand. You do this by seeding context.

The simplest way to do this, especially for AI coding, is to maintain a project context file (CLAUDE.md, AGENTS.md, or your tool’s equivalent)1 that captures your build commands, conventions, and architecture notes, and to point the agent at the specific files that matter (e.g. @AuthService.kt).

There are many other ways, and engineering better context delivery is fast becoming the next frontier in AI development2.
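As a minimal sketch, such a context file might look like this (the commands and conventions below are lifted from the testing example later in this post; substitute your project’s own):

```
# Project notes for AI agents

## Commands
- Test a single class: `make test <ClassName>`
- Coverage for a class: `make test-coverage <ClassName>`

## Conventions
- Kotlin only; JUnit 5 and Mockito for tests
- Prefer fakes over mocks
- Never make real network or database calls in tests
```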

Prompt #

Think of prompts as specs, not search queries. ‘Write me a unit test for this authentication class’ is a search query 🙅‍♂️.

Instead of that one‑liner, here’s how I would start that same prompt:

Persona:
- You're an expert Android developer, well‑versed in industry norms and testing conventions
- You only use Kotlin, JUnit 5, and Mockito

Task:
Write the unit tests for @AuthService.kt

Context/Constraints:
- Follow the existing testing patterns as demonstrated in @RuleEngineService.kt
- Write the tests for the three public methods first: `login`, `logout`, `refreshToken`
- Prefer fakes over mocks; if we don't have a convenient fake class, add one
  - Remember, never make real network or database calls in these tests
- Make sure to cover both happy paths and error cases

Output:
- AuthServiceTest.kt in folder <src/test/...>
- Test names: methodName_condition_expectedResult

Verify:
- Use `make test AuthService` to test just this class
- Don't run lint while iterating; it will take a long time
- I need to hit a code coverage of at least 80%.
  - You can check coverage for this class with `make test-coverage AuthService`

First propose a plan before you start making changes or writing code. Proceed only after I accept it.

I use a text‑expansion snippet, `aiprompt;`, almost every time. It reminds me to structure any prompt:

Persona:
- {cursor}

Task:
-

Context/Details/Constraints:
-

Output:
-

Verify:
-

This structure forces you to think through the problem and gives the AI what it needs to make good decisions.

Custom commands #

Writing detailed prompts every single time gets tedious, so you might want to create “command” templates: markdown files that capture your detailed prompts.

People don’t leverage this enough. If your team maintains a shared folder of commands that everyone iterates on, you end up with a powerful set of prompts you can quickly reuse for strong results. I have commands like /write-unit-test.md, /write-pr-desc.md, /debug-ticket.md, /understand-feature.md, etc.
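As a sketch, /write-unit-test.md could be little more than a parameterized version of the earlier prompt (the $ARGUMENTS placeholder follows Claude Code’s custom‑command convention; adapt it to your tool):

```
Persona:
- You're an expert Android developer, well-versed in industry norms and testing conventions

Task:
- Write the unit tests for $ARGUMENTS

Context/Constraints:
- Follow the existing testing patterns in this repo
- Prefer fakes over mocks; never make real network or database calls

Verify:
- Run `make test <ClassName>` and check coverage stays above 80%

First propose a plan. Proceed only after I accept it.
```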

Chunking #

AI agents hit limits: context windows fill up, attention drifts, hallucinations creep in, results suffer. Newer models can run hours‑long coding sessions, but until that’s common, the simpler fix is to break work into discrete chunks and plan before coding.

Many developers miss this. I can’t stress enough how important it is, especially when you’re working on longer tasks. My post covers this in detail; it was the single biggest step‑function improvement in my own AI coding practice.

Briefly, here’s how I go about it:

Session 1 — Plan only #

Ask the agent to study the relevant code and produce a written plan: a list of small, verifiable tasks. No code changes in this session.
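For example, I might have it save something like this to a plan file (the name plan.md is just my assumption here; any file the agent can re‑read works):

```
## Plan: unit tests for AuthService

- [ ] Task 1: Add a fake token store for tests
- [ ] Task 2: Tests for `login` (happy path + error cases)
- [ ] Task 3: Tests for `logout`
- [ ] Task 4: Tests for `refreshToken`; verify coverage >= 80%
```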

Session 2+ — Execute one task at a time #

Start a fresh session, point the agent at the plan, and have it execute exactly one task. Verify the result before moving to the next.
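The per‑session prompt can then stay tiny, because the plan carries the context (again assuming the hypothetical plan.md above):

```
Read plan.md. Execute Task 2 only: tests for `login`.
Verify with `make test AuthService` before reporting back.
Don't start any other task.
```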

One‑shot requests force the agent to plan and execute simultaneously, which rarely produces great results. If you had to submit this work as PRs for your colleagues to review, how would you break it up? You wouldn’t ship a 10,000‑line PR, so don’t do that with your agents either.

Plan → chunk → execute → verify.


So the next time you’re not getting good results, ask yourself these three things:

  1. Am I providing all the necessary context?
  2. Is my prompt a clear spec?
  3. Am I executing in small, verifiable chunks?

  1. Btw, I wrote a post about consolidating these instructions for various agents and tools. ↩︎

  2. Anthropic’s recent post on “context engineering” is a good overview of techniques. ↩︎