Ollama just made it possible for you to use their models inside of Claude Code.

To test Claude Code's capabilities and compare a top open model with a closed one, I compared GLM 4.7 and Opus 4.5 on a task that requires attention to detail and debugging skills.
The TL;DR is that Opus 4.5 outperformed GLM 4.7 by (what I consider) a large margin. That makes GLM 4.7 seem useless, but it's not.
How Ollama Made It Possible
Making this work is more tedious than difficult.
Claude Code, as an agent, just makes calls to the Anthropic servers to get the model's responses. Getting it to work with any other API is a matter of implementing a proxy that translates the calls from Claude Code into calls to the target API.

I’m not saying that this work has no value. On the contrary, it’s great work that requires attention and knowledge of both systems. But it’s not something that only Ollama could do. claude-code-proxy has done the same thing.
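To make the idea concrete, here is a minimal sketch of what such a proxy can look like: a Node.js server that accepts Anthropic-style /v1/messages requests and forwards a translated payload to a placeholder backend. The translation helpers and the backend URL are illustrative assumptions, not Ollama's or claude-code-proxy's actual code.

// proxy.js - a minimal sketch, not Ollama's actual implementation.
// Accepts Anthropic-style /v1/messages requests and forwards a translated
// payload to a placeholder backend URL. Streaming and errors are ignored for brevity.
const http = require("http");

// Convert an Anthropic Messages request into the target API's format (placeholder logic)
function translateRequest(anthropicReq) {
  return {
    contents: (anthropicReq.messages || []).map((m) => ({
      role: m.role === "assistant" ? "model" : "user",
      parts: [{ text: typeof m.content === "string" ? m.content : JSON.stringify(m.content) }],
    })),
  };
}

// Convert the target API's response back into an Anthropic-style message (placeholder logic)
function translateResponse(backendJson) {
  return {
    id: "msg_" + Date.now(),
    type: "message",
    role: "assistant",
    content: [{ type: "text", text: JSON.stringify(backendJson) }],
    stop_reason: "end_turn",
  };
}

http.createServer((req, res) => {
  if (req.method !== "POST" || !req.url.startsWith("/v1/messages")) {
    res.writeHead(404);
    res.end();
    return;
  }
  let body = "";
  req.on("data", (chunk) => { body += chunk; });
  req.on("end", async () => {
    // Placeholder backend URL; a real proxy would point at the actual inference API
    const backendRes = await fetch("http://localhost:9999/generate", {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(translateRequest(JSON.parse(body))),
    });
    res.writeHead(200, { "Content-Type": "application/json" });
    res.end(JSON.stringify(translateResponse(await backendRes.json())));
  });
}).listen(8082, () => console.log("proxy listening on :8082"));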
The Experiment
I’ve been using OpenCode and Claude Code for a while now. I enjoy OpenCode’s flexibility, and they have a very ergonomic TUI. Better than Claude Code, to be honest.
But Claude Code always made me curious. I know that Opus is a great model; everyone who has tried it for coding tasks agrees on that.
However, I suspected that Claude Code itself played a part in this greatness. Now that Ollama has made it possible, it's a good chance to test the agent in isolation from the model.
I then decided to compare Opus 4.5 vs GLM 4.7 inside Claude Code.
Setting Up Claude Code with Ollama
With the latest Ollama version, it's very simple to set up Claude Code to use Ollama models. You must set two environment variables: ANTHROPIC_BASE_URL and ANTHROPIC_AUTH_TOKEN.
The first must point to your local Ollama server, usually http://localhost:11434 (the same value used in the command below). The second can actually be anything.
The step-by-step guide is:
- Install Ollama
- Start the Ollama server with ollama serve
- Pull the model that you want to use, e.g. ollama pull glm-4.7:cloud
- Start Claude Code with the environment variables set:
ANTHROPIC_BASE_URL=http://localhost:11434 ANTHROPIC_AUTH_TOKEN=ollama claude --model glm-4.7:cloud
I’m using GLM 4.7 from the cloud because I don’t have enough hardware to run it locally. If you have a powerful GPU you can run the GLM 4.7 Flash model locally.
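Before pointing Claude Code at it, you can sanity-check the setup with a short script. The sketch below assumes Ollama exposes an Anthropic-style /v1/messages endpoint at the base URL (the endpoint Claude Code itself calls); adjust the port and model name if yours differ.

// check-ollama.mjs - run with: node check-ollama.mjs
// Assumes Ollama serves an Anthropic-style /v1/messages endpoint at the base URL.
const res = await fetch("http://localhost:11434/v1/messages", {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    "x-api-key": "ollama", // the token is not validated, but Anthropic-style clients send one
  },
  body: JSON.stringify({
    model: "glm-4.7:cloud",
    max_tokens: 64,
    messages: [{ role: "user", content: "Say hello in five words." }],
  }),
});
console.log(res.status, JSON.stringify(await res.json(), null, 2));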
The Task
There’s a common misconception. Since people tend to use Claude Code + Claude models, they think they are the same thing. Well, they aren’t. Claude Code is just a frontend that makes calls to an API that performs the AI inference.
It means we can put any backend API behind it, serving whatever model we want. It's a matter of switching the backend while keeping it compatible with the frontend.
With that in mind, I decided that I wanted to see Claude Code being backed by Gemini 3. But of course, it's 2026, so I didn't want to write the code myself.
I decided that this would be the perfect task to test both models:
Ollama just implemented an adapter for Claude Code
https://github.com/ollama/ollama/blob/main/anthropic/anthropic.go
It acts as a proxy, requests are sent to Ollama API instead of anthropic server.
I want you to implement a proxy like this to Vertex API, but it should be in Node.js.
Make direct calls as in the following curl. Don't install any library for calling Vertex api.
curl "https://aiplatform.googleapis.com/v1/publishers/google/models/gemini-3-pro-preview:streamGenerateContent?key=YOUR_API_KEY" \
-X POST \
-H "Content-Type: application/json" \
-d '{
"contents": [
{
"role": "user",
"parts": [
{
"text": "Explain how AI works in a few words"
}
]
}
]
}'
I will provide the API key through CLI. It should get a server running. Claude Code will be pointed to it.

This is a simple prompt. I give the reference to the Ollama implementation and the curl to call the Vertex API, and I just want a Node.js server that acts as a proxy.
A good outcome should give me a working proxy server that could talk to Claude Code. It doesn’t need to be perfect, but it should be functional.
Results
Between syntax errors and wrong implementations, both models eventually got to a working solution. One of them, however, was far more effective.
Let’s see the timeline of both attempts, starting with GLM 4.7.
GLM 4.7
I gave the prompt, and one minute later, I got an implementation. The server ran with no programming errors, but Claude Code was unable to get a response from it: I sent a "hello" and got nothing back.
The agent’s next step was to debug. It added console logs (as every good Node.js developer does), and I tested again.
Things were looking better: a response was coming back to Claude Code, but it was an error from the Vertex API.
❯ API Error: 400 {"error":{"type":"api_error","message":"Vertex AI returned 400: [{\n \"error\": {\n \"code\": 400,\n \"message\": \"Invalid JSON payload received. Unknown name \\\"text\\\" at 'system_instruction.parts[0]': Proto field is not repeating, cannot start list.\\nInvalid JSON payload received. Unknown name \\\"$schema\\\" at 'tools[0].function_declarations[0].parameters': Cannot find field.\\nInvalid JSON payload received. Unknown name \\\"exclusiveMinimum\\\" at ...

The issue was found: a mismatch between the JSON that Claude Code sends in the Anthropic format and the one expected by the Vertex API.
⏺ I see the issues. Vertex AI has two main problems:
1. System instruction format - The parts field structure is different for system instructions
2. Tool schema conversion - Anthropic uses full JSON Schema (with $schema, exclusiveMinimum, propertyNames) but Vertex AI only supports a subset
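The second point is easy to picture in code. A hedged sketch of that kind of schema clean-up, based only on the field names mentioned in the error above, could look like this:

// Recursively strip the JSON Schema keywords that Vertex rejected in the error above.
// The exact set of unsupported keywords is an assumption based on those messages.
const UNSUPPORTED_KEYS = new Set(["$schema", "exclusiveMinimum", "propertyNames"]);

function sanitizeSchema(schema) {
  if (Array.isArray(schema)) return schema.map(sanitizeSchema);
  if (schema === null || typeof schema !== "object") return schema;
  const cleaned = {};
  for (const [key, value] of Object.entries(schema)) {
    if (UNSUPPORTED_KEYS.has(key)) continue; // drop keywords Vertex does not understand
    cleaned[key] = sanitizeSchema(value);
  }
  return cleaned;
}

// Usage when converting an Anthropic tool definition for Vertex:
// vertexFunctionDeclaration.parameters = sanitizeSchema(anthropicTool.input_schema);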
The model claimed it was fixed; I tried again and got another 400 error. I wanted to help, but the logs were hard to understand: there was too much information, with whole JSON payloads being printed in full.

The model then "improved" things by truncating the logs. It helped: I could see the proxy was making calls with the wrong model name. It should have been using the model name that comes in the request, not a hardcoded one.
⏺ Done. The logging is now much cleaner:
- Truncates long strings (>200 chars) and large objects (>500 chars)
- Hides tool parameters and schema (just shows tool names)
- Simplified request logging (just shows method/path)
- Removed verbose debug logs for headers, body content, and Vertex candidates

The error persisted for the next two attempts. The proxy was doing some kind of replacement on top of the model name that came in the request; I have no idea why.
I then suggested that it just use the model from the request.
❯ theres no need to replace anything
just do this
const model = anthropicReq.model;

Claude Code uses multiple models
Claude Code makes multiple calls to models when you interact with it, and it is clever enough to route requests to different models when needed.
Since in my tests I was sending a simple "hello" prompt, it was using a smaller model for that. That’s great, there’s no need to use a big and expensive model for simple tasks.
Back to GLM
It broke again. The proxy was sending requests to Vertex with the model named claude-haiku-4-5-20251001.
I requested a mapping from Claude model names to Vertex model names. The back-and-forth continued, and by looking at the logs I was pretty sure the communication with Vertex was fine.
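What I had in mind was a small lookup table along these lines; the Haiku ID is the one from my logs, while the other model names are illustrative assumptions:

// Map the model names Claude Code sends to Vertex model names.
// The Haiku ID comes from my logs; the Opus ID and both Gemini names are assumptions.
const MODEL_MAP = {
  "claude-haiku-4-5-20251001": "gemini-3-flash-preview",
  "claude-opus-4-5": "gemini-3-pro-preview",
};

function resolveModel(anthropicModel) {
  // Fall back to the big model when Claude Code sends a name we don't know about
  return MODEL_MAP[anthropicModel] || "gemini-3-pro-preview";
}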
Unfortunately, the model was unable to figure out why I was getting no response inside of Claude Code.
As a last chance for GLM, I asked it to check the implementation Opus had produced and see if there was anything different.
That fixed some of the issues, but I still got an error related to the incompatible JSON schema being sent to Vertex. The model then commented that part out and sent a minimal payload to Vertex instead.
I was able to get a response from Vertex inside of Claude Code, but it was not what I expected: it was as if I were talking to Gemini directly.

It was not incorporating the Claude Code system instructions; it was just a bare proxy to Gemini.
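Passing them through is not much code. A hedged sketch of the missing piece, assuming Vertex accepts a systemInstruction object in the Gemini content format, would be:

// Carry the Claude Code system prompt over to Vertex instead of dropping it.
// Assumes Vertex accepts a systemInstruction object in the Gemini content format.
function buildVertexBody(anthropicReq) {
  const body = {
    contents: (anthropicReq.messages || []).map((m) => ({
      role: m.role === "assistant" ? "model" : "user",
      parts: [{
        text: typeof m.content === "string"
          ? m.content
          : m.content.filter((b) => b.type === "text").map((b) => b.text).join("\n"),
      }],
    })),
  };
  if (anthropicReq.system) {
    // Anthropic's system field can be a plain string or a list of text blocks
    const text = typeof anthropicReq.system === "string"
      ? anthropicReq.system
      : anthropicReq.system.map((b) => b.text).join("\n");
    body.systemInstruction = { parts: [{ text }] };
  }
  return body;
}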
Claude Opus 4.5
You could already tell from the GLM section that Opus was able to implement a functional proxy: its code was the reference GLM used to fix its own implementation.
I gave the prompt to Opus, and one minute later, I had a first implementation, also with some issues:
- model not being taken from the request;
- no response coming to Claude Code;
- wrong endpoint being used (generativelanguage.googleapis.com instead of aiplatform.googleapis.com);
- schema incompatibility;
- claude-haiku being used as the model name;
- and a syntax error due to a missing parenthesis.
All these issues were similar to the ones GLM faced. Opus, however, was far more effective at fixing them.
It got into the same situation as GLM: no response coming back to Claude Code. I passed the logs to it, and it quickly understood what was going on.
⏺ Now I see it. Vertex AI returns a JSON array [{...}, {...}] with pretty-printing, not SSE or newline-delimited JSON. I need to collect the full response then parse it.

And it worked as expected. I got a response from Vertex inside of Claude Code.
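In other words, instead of parsing the stream chunk by chunk, the proxy has to buffer the whole body and parse it as a single JSON array. A hedged sketch of that fix, assuming the usual Gemini candidates/parts response shape, looks like this:

// Vertex returned the whole stream as one pretty-printed JSON array of chunks,
// so buffer the full body before parsing instead of treating it as SSE.
async function readVertexText(vertexResponse) {
  const raw = await vertexResponse.text(); // vertexResponse is a fetch() Response
  const chunks = JSON.parse(raw);          // [{ candidates: [...] }, { candidates: [...] }, ...]
  return chunks
    .map((c) => (c.candidates?.[0]?.content?.parts || []).map((p) => p.text || "").join(""))
    .join("");
}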

Conclusion
Yes, Claude Opus 4.5 is impressive, and yes, it outperformed GLM 4.7 in this experiment. But I’m not surprised.
Some takeaways:
- Both models faced similar issues. It’s clear that the Claude Code agent has a consistent way of prompting the models and it’s pretty good at it.
- Opus 4.5 was way more effective at overcoming the issues. It required fewer interactions to get to a working solution.
- Opus 4.5 is much better at debugging, a skill that is vital for developers (I've used it myself to debug segmentation faults).
- Opus 4.5's code is cleaner and more idiomatic.
It sounds like GLM 4.7 is useless, but that’s not true. While it struggled with this specific cross-protocol proxy task, it remains a highly capable model for standard development work. For routine CRUD operations, CSS tweaks, or boilerplate generation, GLM 4.7 is more than sufficient.
Furthermore, if you have the hardware to run it locally, it becomes a powerful cost-saving tool. You can offload simpler, high-volume tasks to your local GLM instance to save your Claude API credits for complex debugging sessions where Opus 4.5 is needed.
I didn't do any other experimentation with GLM on this matter, but I theorize that if I gave it specs of the Claude Code calls, the Vertex API, and the expected behavior, it could do a better job.
Claude Opus 4.5 can be your choice for any coding-related task. It's the top model for coding right now. I've also heard great things about GPT 5.2 with Codex.
Among open models, GLM 4.7 is way better than the Qwen3-Coder I tested before. But it's still not enough to compete with closed models on complex coding tasks.