Claude Code with Venice

Claude Code is Anthropic’s CLI tool for agentic coding. This guide shows you how to run it through Venice AI for pay-per-token access to Claude Opus 4.5/4.6 and Sonnet 4.5/4.6.

Pay Per Token

No subscription. Pay only for what you use

Claude Models

Access Opus 4.5/4.6 and Sonnet 4.5/4.6 through Venice

Prompt Caching

Venice caching works alongside Claude Code

Why You Need a Router

Claude Code connects directly to Anthropic’s API by default. To use it with Venice, you need claude-code-router, an open-source local proxy that:

Intercepts

Catches Claude Code’s outgoing requests before they reach Anthropic

Transforms

Converts request format and maps model IDs (e.g., claude-opus-45)

Redirects

Forwards requests to Venice at api.venice.ai/api/v1/chat/completions

Prerequisites

Venice Account

With API credits

Node.js

v18 or higher

Claude Code

Installed via npm

Setup

Install Claude Code

If you haven’t already, install Anthropic’s Claude Code CLI:

npm install -g @anthropic-ai/claude-code

Install the Router

npm install -g @musistudio/claude-code-router

Get Your API Key

Generate a key from venice.ai/settings/api. You’ll paste it directly in the config file in the next step.

Create Configuration

Create the config directory:

mkdir -p ~/.claude-code-router

Then create ~/.claude-code-router/config.json with your preferred editor:

# Using nano
nano ~/.claude-code-router/config.json

# Or using VS Code
code ~/.claude-code-router/config.json

Paste the following configuration:

{
  "APIKEY": "",
  "LOG": true,
  "LOG_LEVEL": "info",
  "API_TIMEOUT_MS": 600000,
  "HOST": "127.0.0.1",
  "Providers": [
    {
      "name": "venice",
      "api_base_url": "https://api.venice.ai/api/v1/chat/completions",
      "api_key": "your-venice-api-key-here",
      "models": [
        "claude-opus-45",
        "claude-sonnet-45",
        "claude-opus-4-6",
        "claude-sonnet-4-6"
      ],
      "transformer": {
        "use": ["anthropic"]
      }
    }
  ],
  "Router": {
    "default": "venice,claude-opus-45",
    "think": "venice,claude-opus-45",
    "background": "venice,claude-opus-45",
    "longContext": "venice,claude-opus-45",
    "longContextThreshold": 100000
  }
}

If you modify config.json while the router is running, restart it with ccr restart to apply changes.

Launch

Start the router, then Claude Code:

ccr start
ccr code

Or use the activation method:

eval "$(ccr activate)" && claude

Supported Models

Model	Venice ID	Best For
Claude Opus 4.5	`claude-opus-45`	Complex reasoning, large refactors
Claude Sonnet 4.5	`claude-sonnet-45`	Fast iteration, everyday coding
Claude Opus 4.6	`claude-opus-4-6`	Complex reasoning, large refactors
Claude Sonnet 4.6	`claude-sonnet-4-6`	Fast iteration, everyday coding

Claude Code is optimized for Claude models. While other models available through Venice (GPT, DeepSeek, Grok, etc.) may work, we cannot guarantee an equivalent experience since Claude Code relies on Claude-specific features like extended thinking. For other models, consider using Venice’s standard API.

Router Features

The router provides several useful features beyond basic routing:

Switch models on the fly

Use the /model command inside Claude Code to switch models without restarting:

/model venice,claude-sonnet-45

Useful when you want Opus for complex tasks and Sonnet for quick iterations.

Visual configuration with UI mode

Prefer a GUI? Launch the web-based config editor:

ccr ui

This opens a browser interface for editing your config.json without touching the file directly.

Router scenarios explained

The Router config section controls which model handles different task types:

Scenario	When it’s used
`default`	General requests
`think`	Reasoning-heavy tasks (Plan Mode)
`background`	Background operations
`longContext`	When context exceeds `longContextThreshold` tokens

You can route different scenarios to different models. For example, use Sonnet for background tasks to save costs.

Debugging with logs

If something isn’t working, check the logs:

# Server logs (HTTP, API calls)
~/.claude-code-router/logs/ccr-*.log

# Application logs (routing decisions)
~/.claude-code-router/claude-code-router.log

Set "LOG_LEVEL": "debug" in your config for more verbose output.

Caching Behavior

Venice prompt caching works alongside Claude Code’s native cache markers. Venice automatically detects when Claude Code sends cache_control fields and adjusts its caching strategy accordingly.

Scenario	Cache TTL	Who Controls
Default (recommended)	5 minutes	Claude Code + Venice
With `cleancache` transformer	1 hour	Venice only

When NOT to use cleancache (most users)

The default configuration lets both systems cooperate:

Claude Code sends its native cache_control markers
Venice adds caching around them with a 5-minute TTL
Both systems share the 4-block cache limit

This works well for active coding sessions where you’re making frequent requests.

When to use cleancache

Add cleancache to the transformer if you:

Are hitting the 4-block cache limit errors
Experience strange caching behavior
Prefer Venice’s 1-hour TTL for longer sessions

"transformer": {
  "use": ["anthropic", "cleancache"]
}

This strips Claude Code’s cache markers, giving Venice full control with a longer TTL.

Overview

Guides

Pay Per Token

Claude Models

Prompt Caching

Why You Need a Router

Prerequisites

Venice Account

Node.js

Claude Code

Setup

Supported Models

Router Features

Caching Behavior

Resources

Venice API Docs

claude-code-router

Overview

Guides

Pay Per Token

Claude Models

Prompt Caching

​Why You Need a Router

​Prerequisites

Venice Account

Node.js

Claude Code

​Setup

​Supported Models

​Router Features

​Caching Behavior

​Resources

Venice API Docs

claude-code-router

Why You Need a Router

Prerequisites

Setup

Supported Models

Router Features

Caching Behavior

Resources