Find similar content

Source channel @githubtrending · Post #15521 · Feb 25

#rust#ai_gateway#ai_gateway_support#envoy#envoyproxy#gateway#generative_ai#llm_gateway#llm_inference#llm_proxy#llm_routing#llmops#llms#openai#prompt#proxy#proxy_server#routing Plano is an AI-native proxy server that handles key tasks for agentic apps like routing between agents, smart LLM model selection, safety guardrails, and automatic traces for observability. Define agents in simple YAML, write basic HTTP code in any language, and start Plano to run multi-agent systems without custom plumbing or framework lock-in. You benefit by building and shipping reliable agents to production much faster, focusing on core logic while gaining safety, low latency, and easy scaling. https://github.com/katanemo/plano

Hashtags

#rust #ai_gateway #ai_gateway_support #envoy #envoyproxy #gateway #generative_ai #llm_gateway #llm_inference #llm_proxy #llm_routing #llmops #llms #openai #prompt #proxy #proxy_server #routing

Results

1 similar post found

Search: #localllama

当前筛选 #localllama清除筛选

是芙莉莲

@ireallyhatetheworld · Post #1459 · 03/16/2026, 01:23 PM

Find similar View

Qwen3.5-9B-Claude-4.6-Opus-Uncensored-Distilled-GGUF: 面向本地部署的轻量级创意与推理模型 🔞可用于本地涩涩等场景 • 基于 Qwen 3.5 9B，并融入 Claude Opus 4.6 蒸馏思路，主打更强的创意表达、对话表现与角色扮演场景 • 提供 GGUF 与低显存友好的 Q4_K_M 量化版本，作者反馈在 RTX 3060 12 GB 上可达约 38 tok/s，适合本地聊天、游戏 NPC 与 Home Lab 部署 • 默认关闭 thinking 以提升通用聊天体验，需要时可在 LM Studio 中手动开启；模型采用 Apache 2.0 许可证，便于社区测试与二次集成 https://www.reddit.com/r/LocalLLaMA/comments/1runlpf/qwen359bclaude46opusuncensoreddistilledgguf #AI#Uncensored#本地大模型#模型蒸馏#GGUF#Qwen#Claude#LMStudio#量化模型#低显存部署#角色扮演#LocalLLaMA

Hashtags

#ai #uncensored #本地大模型 #模型蒸馏 #gguf #qwen #claude #lmstudio #量化模型 #低显存部署 #角色扮演 #localllama