Coding Agents 14 May 2026 4 min read

From $0.60 to $0.02: My Journey into Agentic Vibe Coding and Extreme Cost Optimization

Adrian Kuczyński Senior Security Developer

Software development is moving past simple "chat-with-code" toward something far more potent: Agentic Vibe Coding. It’s the art of steering high-level intent while AI agents handle the "how": the terminal commands, the file searching, and the multi-file refactoring.

But as I discovered on my Arch Linux dev machine while building this-is-adix.dev, this power comes with a price tag. If you aren't careful, a single "vibe" session can cost you as much as a fancy coffee.

Here is the story of how I tamed the beast, optimized my Continue setup, and slashed my API costs by over 90%.

The "Context Bloat" Wake-up Call

I started using Continue with Claude 4.6 Sonnet via OpenRouter. The performance was breathtaking, but the bills were... eye-opening. A simple request to add a theme-aligned popup cost me $0.60.

Why? Because the agent was "blindly" reading full files and I was paying the "Open Tabs Tax." Every time I asked a question, Continue was sending every open file in my IDE as context. Combined with the agent's tendency to be overly chatty, I was burning tokens for "politeness" and redundant code explanations.
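
To get a feel for how quickly open tabs add up, here is a rough back-of-the-envelope sketch. The four-characters-per-token rule of thumb, the price per million tokens, and the file sizes are all assumptions chosen for illustration, not measured values or real provider prices.

```python
# Rough cost model for the "Open Tabs Tax". All numbers are illustrative
# assumptions, not measured values or real provider prices.

CHARS_PER_TOKEN = 4            # common rule of thumb for English text and code
PRICE_PER_MILLION_INPUT = 3.0  # hypothetical input price in USD per 1M tokens


def estimate_tokens(text: str) -> int:
    """Approximate the token count of a string from its length."""
    return len(text) // CHARS_PER_TOKEN


def request_cost(context_chunks: list[str]) -> float:
    """Estimate the input cost in USD of sending the given chunks once."""
    tokens = sum(estimate_tokens(chunk) for chunk in context_chunks)
    return tokens * PRICE_PER_MILLION_INPUT / 1_000_000


# Five open tabs of about 40,000 characters each, versus one 2,000-character snippet.
open_tabs = ["x" * 40_000 for _ in range(5)]
snippet = ["x" * 2_000]

print(f"all open tabs: ${request_cost(open_tabs):.4f} per request")
print(f"single snippet: ${request_cost(snippet):.4f} per request")
```

Under these assumed numbers, the open tabs cost about 100 times more per request than the snippet, and an agent session makes many requests.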

The Strategy: The "Silent Detective" Protocol

To fix this, I redesigned my config.yaml and systemMessage around three pillars: Frugality, Autonomy, and Silence.

1. The Detective Rule

Instead of giving the model all the context upfront, I taught it to be a detective. I instructed the agent to use terminal tools (rg, grep, ls) to map the codebase first. It now reads only the specific lines it needs.
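
The "search first, read a slice" idea can be sketched in a few lines of Python. This is only an illustration of the pattern, not what the agent runs internally: the file name `auth.py`, the function `change_password`, and the helper names `find_lines` and `read_window` are all hypothetical.

```python
import re
import tempfile
from pathlib import Path


def find_lines(path: Path, pattern: str) -> list[int]:
    """Return 1-based line numbers matching the pattern (what `rg -n` reports)."""
    regex = re.compile(pattern)
    with path.open(encoding="utf-8") as handle:
        return [number for number, line in enumerate(handle, start=1) if regex.search(line)]


def read_window(path: Path, center: int, radius: int = 3) -> str:
    """Read only the lines within `radius` of `center`, not the whole file."""
    first = max(1, center - radius)
    last = center + radius
    with path.open(encoding="utf-8") as handle:
        picked = [line for number, line in enumerate(handle, start=1) if first <= number <= last]
    return "".join(picked)


# Demo on a throwaway 3000-line file with one interesting line.
with tempfile.TemporaryDirectory() as folder:
    source = Path(folder) / "auth.py"  # hypothetical file name
    lines = [f"# filler line {i}\n" for i in range(1, 3001)]
    lines[1499] = "def change_password(user, new_password):\n"  # becomes line 1500
    source.write_text("".join(lines), encoding="utf-8")

    hits = find_lines(source, r"def change_password")
    window = read_window(source, hits[0])
    whole_file_chars = len(source.read_text(encoding="utf-8"))
    print(hits, len(window), whole_file_chars)
```

The window holds seven lines around the match, so the model sees a small fraction of the characters it would receive if the whole file were sent.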

2. The Silent Protocol

I stripped away the AI’s "personality." I don't need a "Sure, I can help with that!" I need code changes. I implemented a rule where responses must be under 15 words unless explaining complex logic.
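
If you want to spot-check whether replies actually respect the limit, a tiny word counter is enough. This is a hypothetical helper for auditing a chat log, not a Continue feature, and it cannot judge the "complex logic" exception.

```python
def word_count(reply: str) -> int:
    """Count whitespace-separated words in a reply."""
    return len(reply.split())


def breaks_silence_rule(reply: str, limit: int = 15) -> bool:
    """True when the reply has `limit` words or more (the rule asks for under 15)."""
    return word_count(reply) >= limit


chatty = "Sure, I can help with that! Let me walk you through every change I am about to make."
terse = "Added popup component; wired theme colors."
print(breaks_silence_rule(chatty), breaks_silence_rule(terse))
```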

3. The Hybrid Model Approach

Why use a "heavy" model for everything? I moved to a hybrid setup:

Claude 4.6 Sonnet: The "Lead Engineer" for complex refactoring and security logic.
DeepSeek V4 Pro: The "Fast Executor" for boilerplate, unit tests, and, crucially, Tab Autocomplete. It’s roughly 10x cheaper than Sonnet and incredibly snappy.
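
The split above amounts to a small routing table. The sketch below only makes that division of labor explicit. The task labels are hypothetical names invented for this example, not Continue settings, and sending unknown tasks to the stronger model is an assumption rather than something the setup prescribes.

```python
# Task kinds named in the post, mapped to the role each model plays.
# The string keys are hypothetical labels for this sketch, not Continue settings.
LEAD_ENGINEER = "Claude 4.6 Sonnet"
FAST_EXECUTOR = "DeepSeek V4 Pro"

ROUTES = {
    "refactoring": LEAD_ENGINEER,
    "security": LEAD_ENGINEER,
    "boilerplate": FAST_EXECUTOR,
    "unit_tests": FAST_EXECUTOR,
    "autocomplete": FAST_EXECUTOR,
}


def pick_model(task_kind: str) -> str:
    """Choose a model for a task; unknown kinds go to the stronger model (an assumption)."""
    return ROUTES.get(task_kind, LEAD_ENGINEER)


for kind in ("security", "unit_tests", "database_migration"):
    print(kind, "->", pick_model(kind))
```

Defaulting to the expensive model trades a little money for safety. Flip the default if most of your unlabeled work is routine.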

The Modern `config.yaml`

Here is the "battle-tested" configuration I’m running now. It uses the latest 2026 Continue schema, focusing on Prompt Caching and Agentic Awareness.

YAML

# ~/.continue/config.yaml
ui:
  allowAutomaticallyAddedContext: false # No more "Open Tabs Tax"

models:
  - name: "Claude 4.6 Sonnet (Engineer)"
    provider: openrouter
    model: anthropic/claude-3.5-sonnet # This slug selects 3.5 Sonnet, not 4.6 as labeled; use the slug OpenRouter lists for your model
    requestOptions:
      headers:
        "X-OpenRouter-Caching": "true" # Massive savings on repeated context

  - name: "DeepSeek V4 Pro (Worker)"
    provider: openrouter
    model: deepseek/deepseek-v4-pro

tabAutocompleteModel:
  title: "DeepSeek V4 Pro (Autocomplete)"
  provider: openrouter
  model: deepseek/deepseek-v4-pro

context:
  - provider: os      # Agent knows it's on Arch Linux
  - provider: diff    # Essential for cheap, effective Code Reviews
  - provider: repo-map
    params:
      includeSignatures: false # Just the map, save those tokens!

The System Message: My Secret Sauce

This is what keeps the agent in check. It’s a "Token-Frugal" instruction set that forces the model to act instead of talk.

XML

<important_rules>
  You are a Token-Frugal, Silent Agent.
  1. THE SILENCE RULE: Responses under 15 words. No filler. No pleasantries.
  2. TOOL-CENTRIC: Always 'grep' or 'rg' before reading. Never 'cat' a 3000-line file.
  3. FORBIDDEN: Do not repeat code in chat that you are already applying via 'edit' tools.
</important_rules>

The Result: Real-world Testing

I put this setup to the test with a security task: implementing a multi-step password change feature.

The Implementation: The agent used grep to find existing auth patterns, copied the bcrypt cost settings, updated the routes, and built the frontend. Total cost: $0.29.
The Code Review: I used the diff context provider (@git-diff) to ask for a review of the changes.
- Before Optimization: This would have sent the whole controller and cost ~$0.16.
- After Optimization: It sent only the diff. Total cost: $0.02.
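
For the record, here is the arithmetic behind the savings, using only the dollar figures quoted in this post. Note that the $0.16 "before" figure is an estimate of what the review would have cost, and the $0.60 figure comes from a different task (the popup request).

```python
def percent_saved(before: float, after: float) -> float:
    """Percentage reduction from `before` to `after`."""
    return (before - after) / before * 100


review = percent_saved(0.16, 0.02)     # same task: code review, before vs after
headline = percent_saved(0.60, 0.02)   # different tasks: popup request vs diff review
print(f"review: {review:.1f}%  headline: {headline:.1f}%")
```

The like-for-like code review comes out at an 87.5% saving. The headline comparison of $0.60 to $0.02 is about 96.7%, but it compares two different requests.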

Final Thoughts

Vibe coding isn't about being lazy; it's about being an orchestrator. By treating context as a precious resource and moving to an agentic "pull" model (where the AI finds info) rather than a "push" model (where you give it everything), you can build complex systems for pennies.

Stay lean, stay silent, and let the agents do the heavy lifting.

Happy (Frugal) Coding!
