Chạy Claude Code với bất kỳ model nào: Ollama đã hỗ trợ Anthropic API

Nếu bạn đang dùng Claude Code để code với AI, có tin hay: Ollama v0.14.0 đã hỗ trợ Anthropic Messages API.

Nghĩa là bạn có thể chạy Claude Code với model open-source ngay trên máy—hoặc kết nối tới cloud models qua ollama.com. Không còn bị phụ thuộc vào một nhà cung cấp nữa.

Bắt đầu thôi

1. Cài Claude Code

macOS, Linux, WSL:

curl -fsSL https://claude.ai/install.sh | bash

Windows PowerShell:

irm https://claude.ai/install.ps1 | iex

Windows CMD:

curl -fsSL https://claude.ai/install.cmd -o install.cmd && install.cmd && del install.cmd

2. Kết nối Ollama

Set environment variables:

export ANTHROPIC_AUTH_TOKEN=ollama
export ANTHROPIC_BASE_URL=http://localhost:11434

Chạy Claude Code với model bạn chọn:

claude --model gpt-oss:20b

Cloud models cũng được:

claude --model glm-4.7:cloud

Lưu ý: Nên dùng model có ít nhất 32K context length để kết quả tốt nhất. Cloud models mặc định chạy full context.

Model nào phù hợp cho coding?

Chạy local:

gpt-oss:20b
qwen3-coder

Chạy cloud:

glm-4.7:cloud
minimax-m2.1:cloud

Đang dùng Anthropic SDK?

Chỉ cần đổi base URL là xong.

Python:

import anthropic

client = anthropic.Anthropic(
    base_url='http://localhost:11434',
    api_key='ollama',  # bắt buộc nhưng không dùng
)

message = client.messages.create(
    model='qwen3-coder',
    messages=[
        {'role': 'user', 'content': 'Write a function to check if a number is prime'}
    ]
)
print(message.content[0].text)

JavaScript:

import Anthropic from '@anthropic-ai/sdk'

const anthropic = new Anthropic({
  baseURL: 'http://localhost:11434',
  apiKey: 'ollama',
})

const message = await anthropic.messages.create({
  model: 'qwen3-coder',
  messages: [{ role: 'user', content: 'Write a function to check if a number is prime' }],
})

console.log(message.content[0].text)

Tool Calling vẫn hoạt động

import anthropic

client = anthropic.Anthropic(
    base_url='http://localhost:11434',
    api_key='ollama',
)

message = client.messages.create(
    model='qwen3-coder',
    tools=[
        {
            'name': 'get_weather',
            'description': 'Get the current weather in a location',
            'input_schema': {
                'type': 'object',
                'properties': {
                    'location': {
                        'type': 'string',
                        'description': 'The city and state, e.g. San Francisco, CA'
                    }
                },
                'required': ['location']
            }
        }
    ],
    messages=[{'role': 'user', 'content': "What's the weather in San Francisco?"}]
)

for block in message.content:
    if block.type == 'tool_use':
        print(f'Tool: {block.name}')
        print(f'Input: {block.input}')

Các tính năng được hỗ trợ

Messages và multi-turn conversations
Streaming
System prompts
Tool calling / function calling
Extended thinking
Vision (image input)

Việc chạy Claude Code với local models mở ra nhiều khả năng: coding assistant riêng tư, làm việc offline, và thử nghiệm nhiều model khác nhau mà không cần thay đổi workflow.

Bạn sẽ thử model nào đầu tiên?