I’d like to bring up a topic that, in my opinion, doesn’t get nearly enough attention:
How do we write prompts that are not only effective, but also resource-efficient and cost-efficient?
Especially when using tools like Codex (e.g. in VS Code), you quickly notice:
Bad prompts = high token usage + frequent interruptions
Good prompts = stable runs + lower costs + better results
Core Idea: Prompting = Resource Management
Every prompt has three cost factors:
- Tokens (money / limits)
- Compute (time / energy)
- Agent complexity (number of internal steps)
Goal: Maximum output with minimal context and minimal steps
In other words:
Ecological prompting = less compute usage
Economical prompting = fewer tokens
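To make the token factor concrete, here is a tiny sketch. The 4-characters-per-token figure is only a common rule of thumb for English text, not the model's actual tokenizer, so treat the numbers as ballpark estimates:

```python
def estimate_tokens(text: str) -> int:
    # Rough heuristic: ~4 characters per token for English text.
    # Real counts depend on the model's tokenizer (e.g. tiktoken).
    return max(1, len(text) // 4)

vague = "Analyze my whole project and make it production ready"
focused = "Fix the bug in auth_service.py where empty passwords are accepted."

# The focused prompt itself isn't much shorter -- the real savings come
# from the repo scan and extra context the vague prompt triggers later.
print(estimate_tokens(vague), estimate_tokens(focused))
```

The point is that the prompt text is usually cheap; the context the agent pulls in because of a vague prompt is what explodes the bill.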
The Main Problem: “Agent Explosion”
Many prompts unintentionally trigger this behavior:
- The repo gets scanned
- Multiple files are loaded
- A plan is generated
- The plan is revised multiple times
Result:
- many API calls
- high token usage
- high risk of interruption
Typical problematic prompts:
"Analyze my whole project"
"Improve everything"
"Make this production ready"
Best Practices for Stable Prompting
1. Aggressively limit scope
Only modify the file auth_service.py.
Do not touch other files.
2. Provide context directly
Instead of:
Fix my login bug
Use:
Here is the login function:
[paste code]
Fix the bug where empty passwords are accepted.
No repo scan needed = huge win
3. Disable unnecessary exploration
Do not scan the repository.
Do not load additional files.
This significantly reduces internal agent steps.
4. Break tasks into steps (very important!)
Instead of:
Analyze, refactor and optimize this system
Use:
Step 1: Identify the bug only.
Step 2: Propose a fix.
Step 3: Implement it.
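The step pattern can also be driven from code. This is just a sketch: `ask` is a placeholder for whatever client call you actually use (OpenAI SDK, Codex CLI, etc.), passed in as a callable so the example stays self-contained:

```python
def run_steps(code: str, ask) -> str:
    """Run a bug-fix task as three small, bounded requests instead of
    one big exploratory one. `ask` is any callable prompt -> response."""
    # If a step fails or drifts off course, you lose only that step,
    # not the whole run.
    diagnosis = ask(f"Step 1: Identify the bug only. Do not fix it.\n\n{code}")
    proposal = ask(f"Step 2: Propose a minimal fix.\n\nBug: {diagnosis}\n\n{code}")
    return ask(f"Step 3: Implement the fix.\n\nPlan: {proposal}\n\n{code}")

# Flow demo with a stub client that just records each step's first line:
log = []
def fake_ask(prompt: str) -> str:
    log.append(prompt.splitlines()[0])
    return "ok"

run_steps("def login(pw): return True", fake_ask)
print(log)
```

Each request carries only the output of the previous step plus the code, so the context stays small and predictable.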
More control + fewer failures
5. Limit complexity
Keep changes minimal.
Do not introduce new dependencies.
Limit changes to under 100 lines.
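You can also enforce the size limit on your side instead of trusting the model. A minimal sketch using only the standard library (`difflib`); the 100-line limit mirrors the constraint above:

```python
import difflib

def changed_lines(before: str, after: str) -> int:
    """Count added + removed lines in a unified diff of two versions."""
    diff = difflib.unified_diff(before.splitlines(), after.splitlines(),
                                lineterm="")
    return sum(1 for line in diff
               if line.startswith(("+", "-"))
               and not line.startswith(("+++", "---")))

def check_patch(before: str, after: str, limit: int = 100) -> None:
    # Reject oversized patches before they ever reach review.
    n = changed_lines(before, after)
    if n > limit:
        raise ValueError(f"patch touches {n} lines, limit is {limit}")
```

If the model returns a rewrite that blows past the limit, you reject it and re-prompt with a tighter scope, rather than silently merging a huge change.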
Codex-Specific “Hack” (Very Interesting)
From my observations (and this seems reproducible):
Once a Codex task has started, it often continues running even after you hit your token limit.
This suggests:
- Tokens are mainly charged when a request starts, and for each new request
- Running processes are often allowed to complete
Practical Strategy
Start complex tasks shortly before hitting your limit
Why this works:
- The initial request gets accepted
- The task continues internally, even if you're "over the limit" afterward
Result:
- large refactorings can complete
- sometimes without additional token cost
Limitations
This does NOT always work if:
- new requests are required (e.g. loading more files)
- the agent needs to re-plan
- the task is too exploratory
So good prompting still matters a lot.
Best Combined Pattern
A very stable prompt structure:
Work only on the following file:
[paste code]
Task:
Refactor this function to remove duplication.
Constraints:
- Do not scan other files
- Do not load additional context
- Keep changes minimal
Output:
Provide only the updated code.
This gives you:
- minimal token usage
- low agent overhead
- very low interruption risk
Why This Is Also “Ecological”
Fewer tokens = less compute = less energy consumption.
In large codebases, a bad prompt can:
- use 10x more compute
- waste infrastructure resources
Good prompting is not just efficient — it’s sustainable.
Conclusion
- Prompting is not just UX, it's system design
- Good prompts:
  - reduce cost
  - increase stability
  - prevent interruptions
- Codex can be used very efficiently, but only with proper prompting
If anyone has similar or opposite experiences (especially with Claude or Gemini), I'd be really interested to hear them!