AI-ML

Ollama v0.19.0

AI-ML27 de marzo de 2026Impact 27Anuncio oficial

RESUMEN

Ollama is now powered by MLX on Apple Silicon in preview Ollama on Apple silicon is now built on top of Apple’s machine learning framework, ML

Descripción Detallada

Ollama is now powered by MLX on Apple Silicon in preview Ollama on Apple silicon is now built on top of Apple’s machine learning framework, MLX, to take advantage of its unified memory architecture. Read more: What's Changed Ollama's app will now no longer incorrectly show "model is out of date" `ollama launch pi` now includes web search plugin that uses Ollama's web search Improved KV cache hit rate when using the Anthropic-compatible API Fixed tool call parsing issue with Qwen3.5 where tool calls would be output in thinking MLX runner will now create periodic snapshots during prompt processing Fixed KV cache snapshot memory leak in MLX runner Fixed issue where flash attention would be incorrectly enabled for `grok` models Fixed `qwen3-next:80b` not loading in Ollama New Contributors * @amatas made their first contribution in Full Changelog:

Resumen editorial · IA

Nada urgente— No hay cambios que rompan código.

Ollama v0.19.0 mejora la integración con MLX en Apple Silicon y corrige varios errores.

Ollama ya no muestra incorrectamente 'modelo desactualizado'.
El comando `ollama launch pi` ahora incluye un plugin de búsqueda web.
Mejorada la tasa de aciertos de la caché KV con la API compatible con Anthropic.
Se corrigieron varios errores relacionados con la carga de modelos y fugas de memoria.

A quién le importa

Todos los que usan Ollama en Apple Silicon.

Generado por IA · puede contener errores

ai-ml

Releases Relacionados

AI-ML

Ollama v0.32.5

## What's Changed * Fixed an MLX Metal bug that could reduce output quality for NVFP4 models, particularly Laguna. **Full Changelog**: https://github.com/ollama/ollama/compare/v0.32.4...v0.32.5

hace 6d20

AI-ML

Ollama v0.32.4

## What's Changed - Support Laguna on Apple GPUs via the MLX engine - Quantize draft-model output heads at the requested type when creating speculative-decoding drafts. - Fixed Qwen3 MoE decoding for differently-quantized experts, plus faster packed gate/up projection (~4–9% on M5 Max). **Full

hace 1sem20

AI-ML

Ollama v0.19.0

Descripción Detallada

Releases Relacionados

Ollama v0.32.5

Ollama v0.32.4

Ollama v0.32.3

Ollama v0.32.2