Ollama

IA / ML

Ollama tuvo 63 releases en los últimos 6 meses, incluyendo 10 breaking changes.

Total Releases

Últimos 6 meses

10 breaking en total

Frecuencia

—

Promedio entre releases

Impact Score Promedio

Último release:v0.32.5(27 de julio de 2026)

Timeline de Releases

202663 releases

Ollama

v0.32.520

What's Changed * Fixed an MLX Metal bug that could reduce output quality for NVFP4 models, particularly Laguna. Full Changelog:

27 de julio de 2026

Ollama

v0.32.420

What's Changed - Support Laguna on Apple GPUs via the MLX engine - Quantize draft-model output heads at the requested type when creating speculative-decoding drafts. - Fixed Qwen3 MoE decoding for dif…

25 de julio de 2026

Ollama

v0.32.327

What's Changed - Fixed model downloads that stall before sending data. - Improved integrations: restored Claude Code Channels, fixed Anthropic thinking streams, and made Hermes Desktop respect `--forc…

23 de julio de 2026

Ollama

v0.32.220

Withdrawn, please use 0.32.3 or newer

20 de julio de 2026

Ollama

v0.32.1Breaking30

What's Changed - Improved Gemma 4 tool calling and multi-turn reasoning, including more reliable tool-response continuations - Fixed a recurrent MLX model cache leak that could increase memory use acr…

16 de julio de 2026

Ollama

v0.32.0Breaking40

What's Changed - New interactive agent experience: running `ollama` now launches an agent to help you code and delegate work ``` ❯ ollama Ollama 0.32.0 ▸ Chat, Code, & Work (glm-5.2:cloud) Chat with m…

11 de julio de 2026

Ollama

v0.31.220

What's Changed Enabled flash attention on older NVIDIA GPUs (compute capability 6.x) iGPU can now offload vision models with padding to fit available memory Fixed structured output for thinking models…

6 de julio de 2026

Ollama

v0.31.120

Faster Gemma 4 on Apple Silicon Gemma 4 is now significantly faster in Ollama on Apple Silicon, generating tokens nearly 90% fast

30 de junio de 2026

Ollama

v0.30.1220

What's Changed tools: ignore braces inside JSON strings when detecting tool call end by @aditya-786 in mlx: bump dependency by @dhiltgen in * llama.cpp update by @dhiltgen in

29 de junio de 2026

Ollama

v0.30.1120

What's Changed launch: add thinking capability detection to opencode by @hoyyeva in launch: auto-install Claude Code by @hoyyeva in * launch: auto-install opencode when missing by @hoyyeva in

25 de junio de 2026

Ollama

v0.30.1020

What's Changed Command A and North family models now run on Apple Silicon with the MLX engine Updated the underlying llama.cpp engine to build 9672 * Fixed build artifacts for MLX Full Changelog:

17 de junio de 2026

Ollama

v0.30.920

What's Changed Support for Cohere2Moe architecture Fixed LFM2 parser/render for cases where thinking was not emitted Fixed issue where `ollama launch claude` and other coding agent or assistant use ca…

15 de junio de 2026

Ollama

v0.30.820

What's Changed Fixed `ollama launch` selecting the wrong provider in some cases Improved prompt caching by decoupling it from context shift for better KV cache reuse More stable MLX inference with har…

12 de junio de 2026

Ollama

v0.30.720

Ollama Launch now supports Hermes Desktop, a native desktop interface for the Hermes agent. Run it alongside your Hermes agent to get a visual interface for managing conversations, integrations, and m…

7 de junio de 2026

Ollama

v0.30.620

New models - Gemma 4 QAT weights: the Gemma 4 family is now optimized with Quantization-Aware Training (QAT) to dramatically reduce memory requirements and maximize on-device performance. Look for the…

5 de junio de 2026

Ollama

v0.30.520

What's Changed Fixed the `gemma4:12b` floating point exception crash on x86, CUDA, Linux, and Windows systems. `ollama launch hermes-desktop` now launches Hermes Desktop and can skip rebuilding when a…

4 de junio de 2026

Ollama

v0.30.420

New models - Nemotron-3-Ultra: NVIDIA Nemotron 3 Ultra is built for high-throughput reasoning and long-running agent workflows. What's Changed * Fixed multimodal models not using GPU on the llama.cpp…

3 de junio de 2026

Ollama

v0.30.317

New models - Gemma 4 12B: high-performance multimodal intelligence that runs directly on laptops, combining efficiency with advanced reasoning. What's Changed * Added support for `gemma4:12b`. Full Ch…

3 de junio de 2026

Ollama

v0.30.217

What's Changed `ollama launch` now supports Qwen Code and can guide users through installing the Cline CLI when it is missing. `ollama launch codex` now uses an isolated launch configuration, avoiding…

3 de junio de 2026

Ollama

v0.30.120

What's Changed feat(launch): show and auto-install Cline CLI by @hoyyeva in log template details to aid troubleshooting by @dhiltgen in * cmd/launch: add Qwen code integration by @hoyyeva in

2 de junio de 2026

Ollama

v0.24.0Breaking40

Codex App Ollama 0.24 includes support for the Codex App, OpenAI's desktop experience for working on Codex threads in parallel with built-in worktree support and git functionality. ```bash ollama laun…

14 de mayo de 2026

Ollama

v0.23.420

What's Changed `ollama launch opencode` now supports vision models with image inputs Fixed formatting of Claude tool results when using local image paths Full Changelog:

13 de mayo de 2026

Ollama

v0.30.030

Ollama 0.30 is now available, with improved compatibility and performance using llama.cpp. This augments the MLX engine on Apple Silicon, bringing support to a wider range of hardware. This release br…

13 de mayo de 2026

Ollama

v0.23.3Breaking30

What's Changed mlx: refined model push behavior by @dhiltgen in test: integration test hardening by @dhiltgen in app: harden update flows by @dhiltgen in

12 de mayo de 2026

Ollama

v0.23.220

What's Changed `ollama launch` no longer includes Claude Desktop due to the third-party integration being limited to Anthropic models. Use `ollama launch claude-desktop --restore` to restore Claude De…

7 de mayo de 2026

Ollama

v0.23.1Breaking30

Gemma 4 MTP (Multi-token Processing) for the MLX runner Gemma 4 MTP speculative decoding is now supported on Macs. This can give over a 2x speed increase for the Gemma 4 31B model on coding tasks. ```…

5 de mayo de 2026

Ollama

v0.23.0Breaking40

Claude Desktop Claude Desktop is now supported with Ollama Launch. Claude Cowork and Claude Code are supported within the Claude Desktop App. ``` ollama launch claude-desktop ``` Claude Cowork <img wi…

3 de mayo de 2026

Ollama

v0.22.1Breaking30

What's Changed Updated the Gemma 4 renderer for thinking and tool calling improvements Model recommendations are now updated without updating Ollama Aligned the desktop app's launch page with `ollama…

28 de abril de 2026

Ollama

v0.22.0Breaking40

New models NVIDIA's Nemotron 3 Omni Poolside's first open-weight coding model - Laguna XS.2 Full Changelog:

28 de abril de 2026

Ollama

v0.21.320

What's Changed api: accept "max" as a think value by @ParthSareen in openai: map responses reasoning effort to think by @ParthSareen in Full Changelog:

24 de abril de 2026

Ollama

v0.21.217

What's Changed Improved reliability of the OpenClaw onboarding flow in `ollama launch` Recommended models in `ollama launch` now appear in a fixed, canonical order OpenClaw integration now bundles Oll…

23 de abril de 2026

Ollama

v0.21.117

What's Changed Kimi CLI You can now install and run the Kimi CLI through Ollama. ``` ollama launch kimi --model kimi-k2.6:cloud ``` Kimi CLI with Kimi K2.6 excels at long horizon agentic execution tas…

22 de abril de 2026

Ollama

v0.21.0Breaking37

Hermes Agent ``` ollama launch hermes ``` Hermes learns with you, automatically creating skills to better serve your workflows. Great for research and engineering tasks. <img width="1329" height="946"…

16 de abril de 2026

Ollama

v0.20.8Breaking27

What's Changed ROCm: Update to ROCm 7.2.1 on Linux by @saman-amd in gemma4: fix nothink case renderer by @drifkin in * gemma4: fix compiler error on metal by @dhiltgen in

14 de abril de 2026

Ollama

v0.20.717

What's Changed Fix quality of gemma:e2b and gemma:e4b when thinking is disabled ROCm: Update to ROCm 7.2.1 on Linux by @saman-amd in Full Changelog:

13 de abril de 2026

Ollama

v0.20.617

What's Changed Gemma 4 tool calling ability is improved and updated to use Google's latest post-launch fixes Parallel tool calling improved for streaming responses Hermes agent Ollama integration guid…

10 de abril de 2026

Ollama

v0.20.517

OpenClaw channel setup with `ollama launch` What's Changed - OpenClaw channel setup: connect WhatsApp, Telegram, Discord, and other messaging channels thro

9 de abril de 2026

Ollama

v0.20.417

What's Changed mlx: Improve M5 performance with NAX gemma4: enable flash attention Full Changelog:

7 de abril de 2026

Ollama

v0.20.317

What's Changed Gemma 4 Tool Calling improvements Added latest models to Ollama App * OpenClaw fixes for launching TUI Full Changelog:

7 de abril de 2026

Ollama

v0.20.217

What's Changed * app: default app home view to new chat instead of launch by @jmorganca in Full Changelog:

4 de abril de 2026

Ollama

v0.20.117

What's Changed bench: add prompt calibration, context size flag, and NumCtx reporting by @dhiltgen in model/parsers: fix gemma4 arg parsing when quoted strings contain " by @drifkin in * ggml: skip cu…

3 de abril de 2026

Ollama

v0.20.027

Gemma 4 Effective 2B (E2B) ``` ollama run gemma4:e2b ``` Effective 4B (E4B) ``` ollama run gemma4:e4b ``` **26B (Mixture of Experts mod

2 de abril de 2026

Ollama

v0.19.027

Ollama is now powered by MLX on Apple Silicon in preview Ollama on Apple silicon is now built on top of Apple’s machine learning framework, ML

27 de marzo de 2026

Ollama

v0.18.424

What's Changed ggml: force flash attention off for grok by @rick-github in mlx: fix KV cache snapshot memory leak by @jessegross in * mlxrunner: schedule periodic snapshots during prefill by @jessegro…

26 de marzo de 2026

Ollama

v0.18.317

Visual Studio Code Microsoft Visual Studio Code now directly integrates with Ollama via GitHub Copilot. If you have Ollama installed, any local or cloud model from Ollama can be selected for use withi…

25 de marzo de 2026

Ollama

v0.18.217

What's Changed Add extra check to ensure `npm` and `git` are installed before installing OpenClaw Claude Code will now be faster when run locally, due to preventing cache breakages Fix to correctly su…

18 de marzo de 2026

Ollama

v0.18.124

Web Search and Fetch in OpenClaw Ollama now ships with web search and web fetch plugin for OpenClaw. This allows Ollama's models (local or cloud) to search the web for the latest content and news. Thi…

17 de marzo de 2026

Ollama

v0.18.027

Ollama 0.18 includes improved performance for OpenClaw and Ollama’s cloud models, including the new Nemotron-3-Super model by NVIDIA designed for high-performance agentic reasoning tasks. Improved Ope…

14 de marzo de 2026

Ollama

v0.17.820

What's Changed parsers: repair unclosed arg_value tags in GLM tool calls by @BruceMacD in Reapply "don't require pulling stubs for cloud models" again by @jmorganca in * docs: format compat docs by @m…

10 de marzo de 2026

Ollama

v0.17.720

What's Changed Allow thinking levels such as `"medium"` to correctly interpreted in Ollama's API for all thinking models Add context length to support compaction when using `ollama launch` Full Change…

5 de marzo de 2026

Ollama

v0.17.620

What's Changed Fixed issue where GLM-OCR would not work due to incorrect prompt rendering Fixed tool calling parsing and rendering for Qwen 3.5 models New Contributors * @Victor-Quqi made their first…

4 de marzo de 2026

Ollama

v0.17.520

New models - Qwen3.5: the small Qwen 3.5 model series is now available in 0.8B, 2B, 4B and 9B parameter sizes. What's Changed Fixed crash in Qwen 3.5 models when split over GPU & CPU Fixed issue where…

2 de marzo de 2026

Ollama

v0.17.424

New models - Qwen 3.5: a family of open-source multimodal models that delivers exceptional utility and performance. - LFM 2: LFM2 is a family of hybrid models designed for on-device deployment. LFM2-2…

27 de febrero de 2026

Ollama

v0.17.317

What's Changed * Fixed issue where tool calls in the Qwen 3 and Qwen 3.5 model families would not be parsed correctly if emitted during thinking Full Changelog:

27 de febrero de 2026

Ollama

v0.17.217

What's Changed * Fixed issue where Ollama's app on Windows would crash when a new update has been downloaded Full Changelog:

26 de febrero de 2026

Ollama

v0.17.117

What's Changed Nemotron architecture support in Ollama's engine MLX engine now has improved memory usage Ollama's app will now allow models that support tools to use web search capabilities Improved L…

24 de febrero de 2026

Ollama

v0.17.027

OpenClaw OpenClaw can now be installed and configured automatically via Ollama, making it the easiest way to get up and running with OpenClaw with open models like Kimi-K2.5, GLM-5, and Minimax-M2.5.…

21 de febrero de 2026

Ollama

v0.16.317

What's Changed New `ollama launch cline` added for the Cline CLI `ollama launch ` will now always show the model picker Added Gemma 3, Llama and Qwen 3 architectures to MLX runner New Contributors @he…

19 de febrero de 2026

Ollama

v0.16.217

What's Changed `ollama launch claude` now supports searching the web when using `:cloud` models Fixed rendering issue when running `ollama` in PowerShell * New setting in Ollama's app makes it easier…

14 de febrero de 2026

Ollama

v0.16.117

What's Changed Installing Ollama via the `curl` install script on macOS will now only prompt for your password if its required Installing Ollama via the `iem` install script in Windows will now show p…

12 de febrero de 2026

Ollama

v0.16.027

New models GLM-5: A strong reasoning and agentic model from Z.ai with 744B total parameters (40B active), built for complex systems engineering and long-horizon tasks. MiniMax-M2.5: a new state-of-the…

12 de febrero de 2026

Ollama

v0.15.617

What's Changed Fixed context limits when running `ollama launch droid` `ollama launch` will now download missing models instead of erroring * Fixed bug where `ollama launch claude` would cause context…

7 de febrero de 2026

Ollama

v0.15.520

New models - Qwen3-Coder-Next: a coding-focused language model from Alibaba's Qwen team, optimized for agentic coding workflows and local development. - GLM-OCR: GLM-OCR is a multimodal OCR model for…

3 de febrero de 2026