Skip to main content
Claude Code is Anthropic’s CLI tool for agentic coding. This guide shows you how to run it through Venice AI for pay-per-token access to Claude Opus 4.5 and Sonnet 4.5.

Pay Per Token

No subscription. Pay only for what you use

Claude Models

Access Opus 4.5 and Sonnet 4.5 through Venice

Prompt Caching

Venice caching works alongside Claude Code

Why You Need a Router

Claude Code connects directly to Anthropic’s API by default. To use it with Venice, you need claude-code-router, an open-source local proxy that:

Intercepts

Catches Claude Code’s outgoing requests before they reach Anthropic

Transforms

Converts request format and maps model IDs (e.g., claude-opus-45)

Redirects

Forwards requests to Venice at api.venice.ai/api/v1/chat/completions

Prerequisites


Setup

1

Install Claude Code

If you haven’t already, install Anthropic’s Claude Code CLI:
npm install -g @anthropic-ai/claude-code
2

Install the Router

npm install -g claude-code-router
3

Get Your API Key

Generate a key from venice.ai/settings/api. You’ll paste it directly in the config file in the next step.
4

Create Configuration

Create the config directory:
mkdir -p ~/.claude-code-router
Then create ~/.claude-code-router/config.json with your preferred editor:
# Using nano
nano ~/.claude-code-router/config.json

# Or using VS Code
code ~/.claude-code-router/config.json
Paste the following configuration:
{
  "APIKEY": "",
  "LOG": true,
  "LOG_LEVEL": "info",
  "API_TIMEOUT_MS": 600000,
  "HOST": "127.0.0.1",
  "Providers": [
    {
      "name": "venice",
      "api_base_url": "https://api.venice.ai/api/v1/chat/completions",
      "api_key": "your-venice-api-key-here",
      "models": [
        "claude-opus-45",
        "claude-sonnet-45"
      ],
      "transformer": {
        "use": ["anthropic"]
      }
    }
  ],
  "Router": {
    "default": "venice,claude-opus-45",
    "think": "venice,claude-opus-45",
    "background": "venice,claude-opus-45",
    "longContext": "venice,claude-opus-45",
    "longContextThreshold": 100000
  }
}
If you modify config.json while the router is running, restart it with ccr restart to apply changes.
5

Launch

Start the router, then Claude Code:
ccr start
ccr code
Or use the activation method:
eval "$(ccr activate)" && claude

Supported Models

ModelVenice IDBest For
Claude Opus 4.5claude-opus-45Complex reasoning, large refactors
Claude Sonnet 4.5claude-sonnet-45Fast iteration, everyday coding
Claude Code only works with Claude models. While Venice supports GPT, DeepSeek, Grok, and others, Claude Code requires Claude-specific features like extended thinking. For other models, use Venice’s standard API.

Router Features

The router provides several useful features beyond basic routing:
Use the /model command inside Claude Code to switch models without restarting:
/model venice,claude-sonnet-45
Useful when you want Opus for complex tasks and Sonnet for quick iterations.
Prefer a GUI? Launch the web-based config editor:
ccr ui
This opens a browser interface for editing your config.json without touching the file directly.
The Router config section controls which model handles different task types:
ScenarioWhen it’s used
defaultGeneral requests
thinkReasoning-heavy tasks (Plan Mode)
backgroundBackground operations
longContextWhen context exceeds longContextThreshold tokens
You can route different scenarios to different models. For example, use Sonnet for background tasks to save costs.
If something isn’t working, check the logs:
# Server logs (HTTP, API calls)
~/.claude-code-router/logs/ccr-*.log

# Application logs (routing decisions)
~/.claude-code-router/claude-code-router.log
Set "LOG_LEVEL": "debug" in your config for more verbose output.

Caching Behavior

Venice prompt caching works alongside Claude Code’s native cache markers. Venice automatically detects when Claude Code sends cache_control fields and adjusts its caching strategy accordingly.
ScenarioCache TTLWho Controls
Default (recommended)5 minutesClaude Code + Venice
With cleancache transformer1 hourVenice only
The default configuration lets both systems cooperate:
  • Claude Code sends its native cache_control markers
  • Venice adds caching around them with a 5-minute TTL
  • Both systems share the 4-block cache limit
This works well for active coding sessions where you’re making frequent requests.
Add cleancache to the transformer if you:
  • Are hitting the 4-block cache limit errors
  • Experience strange caching behavior
  • Prefer Venice’s 1-hour TTL for longer sessions
"transformer": {
  "use": ["anthropic", "cleancache"]
}
This strips Claude Code’s cache markers, giving Venice full control with a longer TTL.

Resources