I’d like to bring up a topic that, in my opinion, doesn’t get nearly enough attention:
How do we write prompts that are not only effective, but also resource-efficient and cost-efficient?
Especially when using tools like Codex (e.g. in VS Code), you quickly notice:
Bad prompts = high token usage + frequent interruptions
Good prompts = stable runs + lower costs + better results
Core Idea: Prompting = Resource Management
Every prompt has three cost factors:
- Tokens (money / limits)
- Compute (time / energy)
- Agent complexity (number of internal steps)
Goal: Maximum output with minimal context and minimal steps
In other words:
Ecological prompting = less compute usage
Economical prompting = fewer tokens
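To make the token factor concrete, here is a tiny sketch. The 4-characters-per-token figure is only a common rule of thumb for English text, not the model's actual tokenizer, so treat the numbers as ballpark estimates:

```python
def estimate_tokens(text: str) -> int:
    # Rough heuristic: ~4 characters per token for English text.
    # Real counts depend on the model's tokenizer (e.g. tiktoken).
    return max(1, len(text) // 4)

vague = "Analyze my whole project and make it production ready"
focused = "Fix the bug in auth_service.py where empty passwords are accepted."

# The focused prompt itself isn't much shorter -- the real savings come
# from the repo scan and extra context the vague prompt triggers later.
print(estimate_tokens(vague), estimate_tokens(focused))
```

The point is that the prompt text is usually cheap; the context the agent pulls in because of a vague prompt is what explodes the bill.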
The Main Problem: “Agent Explosion”
Many prompts unintentionally trigger this behavior:
- The repo gets scanned
- Multiple files are loaded
- A plan is generated
- The plan is revised multiple times
Result:
- many API calls
- high token usage
- high risk of interruption
Typical problematic prompts:
"Analyze my whole project"
"Improve everything"
"Make this production ready"
Best Practices for Stable Prompting
1. Aggressively limit scope
Only modify the file auth_service.py.
Do not touch other files.
2. Provide context directly
Instead of:
Fix my login bug
Use:
Here is the login function:
[paste code]
Fix the bug where empty passwords are accepted.
No repo scan needed = huge win
3. Disable unnecessary exploration
Do not scan the repository.
Do not load additional files.
This significantly reduces internal agent steps.
4. Break tasks into steps (very important!)
Instead of:
Analyze, refactor and optimize this system
Use:
Step 1: Identify the bug only.
Step 2: Propose a fix.
Step 3: Implement it.
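The step pattern can also be driven from code. This is just a sketch: `ask` is a placeholder for whatever client call you actually use (OpenAI SDK, Codex CLI, etc.), passed in as a callable so the example stays self-contained:

```python
def run_steps(code: str, ask) -> str:
    """Run a bug-fix task as three small, bounded requests instead of
    one big exploratory one. `ask` is any callable prompt -> response."""
    # If a step fails or drifts off course, you lose only that step,
    # not the whole run.
    diagnosis = ask(f"Step 1: Identify the bug only. Do not fix it.\n\n{code}")
    proposal = ask(f"Step 2: Propose a minimal fix.\n\nBug: {diagnosis}\n\n{code}")
    return ask(f"Step 3: Implement the fix.\n\nPlan: {proposal}\n\n{code}")

# Flow demo with a stub client that just records each step's first line:
log = []
def fake_ask(prompt: str) -> str:
    log.append(prompt.splitlines()[0])
    return "ok"

run_steps("def login(pw): return True", fake_ask)
print(log)
```

Each request carries only the output of the previous step plus the code, so the context stays small and predictable.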
More control + fewer failures
5. Limit complexity
Keep changes minimal.
Do not introduce new dependencies.
Limit changes to under 100 lines.
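You can also enforce the size limit on your side instead of trusting the model. A minimal sketch using only the standard library (`difflib`); the 100-line limit mirrors the constraint above:

```python
import difflib

def changed_lines(before: str, after: str) -> int:
    """Count added + removed lines in a unified diff of two versions."""
    diff = difflib.unified_diff(before.splitlines(), after.splitlines(),
                                lineterm="")
    return sum(1 for line in diff
               if line.startswith(("+", "-"))
               and not line.startswith(("+++", "---")))

def check_patch(before: str, after: str, limit: int = 100) -> None:
    # Reject oversized patches before they ever reach review.
    n = changed_lines(before, after)
    if n > limit:
        raise ValueError(f"patch touches {n} lines, limit is {limit}")
```

If the model returns a rewrite that blows past the limit, you reject it and re-prompt with a tighter scope, rather than silently merging a huge change.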
Codex-Specific “Hack” (Very Interesting)
From my observations (and this seems reproducible):
Once a Codex task has started, it often continues running even after you hit your token limit.
This suggests:
- Tokens are mainly charged when a request starts, and for each new request
- Running processes are often allowed to complete
Practical Strategy
Start complex tasks shortly before hitting your limit
Why this works:
- The initial request gets accepted
- The task continues internally, even if you're "over the limit" afterward
Result:
- large refactorings can complete
- sometimes without additional token cost
Limitations
This does NOT always work if:
- new requests are required (e.g. loading more files)
- the agent needs to re-plan
- the task is too exploratory
So good prompting still matters a lot.
Best Combined Pattern
A very stable prompt structure:
Work only on the following file:
[paste code]
Task:
Refactor this function to remove duplication.
Constraints:
- Do not scan other files
- Do not load additional context
- Keep changes minimal
Output:
Provide only the updated code.
This gives you:
- minimal token usage
- low agent overhead
- very low interruption risk
Why This Is Also “Ecological”
Fewer tokens = less compute = less energy consumption.
In large codebases, a bad prompt can:
- use 10x more compute
- waste infrastructure resources
Good prompting is not just efficient — it’s sustainable.
Conclusion
- Prompting is not just UX, it's system design
- Good prompts:
  - reduce cost
  - increase stability
  - prevent interruptions
- Codex can be used very efficiently, but only with proper prompting
If anyone has similar or opposite experiences (especially with Claude or Gemini), I'd be really interested to hear them!