Hugging Face (Twitter) RT @victormustar: 🎉 llama.cpp now has Ollama-style model management.
• Auto-discover GGUFs from cache
• Load on first request
• Each model runs in its own process
• Route by `model` (OpenAI-compatible API)
• LRU unload at `--models-max`
https://huggingface.co/blog/ggml-org/model-management-in-llamacpp
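
For readers who want a feel for the behavior listed in the post, here is a minimal Python sketch of load on first request, routing by `model`, and LRU unload at a cap. It is a toy illustration, not llama.cpp's actual implementation: `ModelRouter`, `route`, `_load`, `_unload`, and `events` are hypothetical names, and the per-model process is replaced by a plain dict.

```python
from collections import OrderedDict


class ModelRouter:
    """Toy router: load on first request, evict least recently used at the cap."""

    def __init__(self, models_max):
        if models_max < 1:
            raise ValueError("models_max must be at least 1")
        self.models_max = models_max   # stands in for the `--models-max` limit
        self.loaded = OrderedDict()    # model name -> handle, least recently used first
        self.events = []               # record of load/unload actions, for inspection

    def _load(self, name):
        # Hypothetical placeholder: the post says each model runs in its own
        # process; this sketch just returns a dict instead of spawning one.
        self.events.append(("load", name))
        return {"model": name}

    def _unload(self, name):
        # Hypothetical placeholder for stopping that model's process.
        self.events.append(("unload", name))

    def route(self, request):
        name = request["model"]        # route by the `model` field of the request
        if name in self.loaded:
            self.loaded.move_to_end(name)  # mark as most recently used
        else:
            # Make room first, evicting the least recently used model(s).
            while len(self.loaded) >= self.models_max:
                oldest, _ = self.loaded.popitem(last=False)
                self._unload(oldest)
            self.loaded[name] = self._load(name)
        return self.loaded[name]
```

With `models_max=2`, requesting hypothetical models `a`, `b`, `a`, `c` in that order leaves `a` and `c` loaded: `b` is the least recently used when `c` arrives, so it is the one unloaded.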