
If you've been using Claude Code for AI-assisted coding, here's exciting news: Ollama v0.14.0 now supports the Anthropic Messages API.
This means you can run Claude Code with open-source models on your machine—or connect to cloud models through ollama.com. No more vendor lock-in.
Getting Started
1. Install Claude Code
macOS, Linux, WSL:
curl -fsSL https://claude.ai/install.sh | bash
Windows PowerShell:
irm https://claude.ai/install.ps1 | iex
Windows CMD:
curl -fsSL https://claude.ai/install.cmd -o install.cmd && install.cmd && del install.cmd
2. Connect to Ollama
Set these environment variables:
export ANTHROPIC_AUTH_TOKEN=ollama
export ANTHROPIC_BASE_URL=http://localhost:11434
Run Claude Code with your chosen model:
claude --model gpt-oss:20b
Cloud models work too:
claude --model glm-4.7:cloud
Pro tip: Use models with at least 32K context length for best results. Cloud models run at full context by default.
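If the model isn't on your machine yet, pull it first. In recent Ollama versions you can also raise the server's default context window with the OLLAMA_CONTEXT_LENGTH environment variable; the model name and the 32768 value below are just examples:
ollama pull gpt-oss:20b
OLLAMA_CONTEXT_LENGTH=32768 ollama serve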
Recommended Models for Coding
Local models:
- gpt-oss:20b
- qwen3-coder
Cloud models:
- glm-4.7:cloud
- minimax-m2.1:cloud
Using the Anthropic SDK
Already building with the Anthropic SDK? Just change the base URL.
Python:
import anthropic

client = anthropic.Anthropic(
    base_url='http://localhost:11434',
    api_key='ollama',  # required but ignored
)

message = client.messages.create(
    model='qwen3-coder',
    max_tokens=1024,
    messages=[
        {'role': 'user', 'content': 'Write a function to check if a number is prime'}
    ]
)

print(message.content[0].text)
JavaScript:
import Anthropic from '@anthropic-ai/sdk'

const anthropic = new Anthropic({
  baseURL: 'http://localhost:11434',
  apiKey: 'ollama', // required but ignored
})

const message = await anthropic.messages.create({
  model: 'qwen3-coder',
  max_tokens: 1024,
  messages: [{ role: 'user', content: 'Write a function to check if a number is prime' }],
})

console.log(message.content[0].text)
Tool Calling Works Too
import anthropic

client = anthropic.Anthropic(
    base_url='http://localhost:11434',
    api_key='ollama',
)

tools = [
    {
        'name': 'get_weather',
        'description': 'Get the current weather in a location',
        'input_schema': {
            'type': 'object',
            'properties': {
                'location': {
                    'type': 'string',
                    'description': 'The city and state, e.g. San Francisco, CA'
                }
            },
            'required': ['location']
        }
    }
]

message = client.messages.create(
    model='qwen3-coder',
    max_tokens=1024,
    tools=tools,
    messages=[{'role': 'user', 'content': "What's the weather in San Francisco?"}]
)

for block in message.content:
    if block.type == 'tool_use':
        print(f'Tool: {block.name}')
        print(f'Input: {block.input}')
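To complete the loop, run the requested tool yourself and send its output back in a tool_result block so the model can answer in plain text. Here is a minimal sketch of that second turn, continuing the example above and assuming the endpoint follows the standard Anthropic tool-use round trip; run_get_weather is a hypothetical placeholder for your own implementation:
def run_get_weather(location):
    # Hypothetical stand-in for a real weather lookup.
    return f'Sunny and 18°C in {location}'

# Collect a tool_result block for each tool_use block the model emitted.
tool_results = []
for block in message.content:
    if block.type == 'tool_use':
        tool_results.append({
            'type': 'tool_result',
            'tool_use_id': block.id,
            'content': run_get_weather(block.input['location']),
        })

# Send the assistant's tool calls and our results back for a final answer.
follow_up = client.messages.create(
    model='qwen3-coder',
    max_tokens=1024,
    tools=tools,
    messages=[
        {'role': 'user', 'content': "What's the weather in San Francisco?"},
        {'role': 'assistant', 'content': message.content},
        {'role': 'user', 'content': tool_results},
    ],
)

print(follow_up.content[0].text)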
Supported Features
- Messages and multi-turn conversations
- Streaming (see the example after this list)
- System prompts
- Tool calling / function calling
- Extended thinking
- Vision (image input)
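Streaming, for example, works through the Anthropic SDK's streaming helper. A minimal sketch, assuming the same local setup as the Python examples above:
import anthropic

client = anthropic.Anthropic(
    base_url='http://localhost:11434',
    api_key='ollama',
)

# Print tokens as they arrive instead of waiting for the full response.
with client.messages.stream(
    model='qwen3-coder',
    max_tokens=1024,
    messages=[{'role': 'user', 'content': 'Explain the Sieve of Eratosthenes briefly'}],
) as stream:
    for text in stream.text_stream:
        print(text, end='', flush=True)
print()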
The ability to run Claude Code with local models opens up interesting possibilities: private coding assistance, offline development, and experimenting with different models without changing your workflow.
What model will you try first?