Recent posts
Published Dec 15
Hugging Face (Twitter) RT @donvito: No Gemma 4 yet so I went through Google’s @huggingface Discovered this, wow MedGemma is a collection of Gemma 3 variants that are trained for performance on medical text and image comprehension Developers can use MedGemma to accelerate building healthcare-based AI applications
Published Dec 15
Hugging Face (Twitter) RT @allen_ai: Introducing Bolmo, a new family of byte-level language models built by "byteifying" our open Olmo 3—and to our knowledge, the first fully open byte-level LM to match or surpass SOTA subword models across a wide range of tasks. 🧵
Published Dec 15
Hugging Face (Twitter) RT @JianZhangCS: 🚀 @Nvidia Nemotron 3 Nano is live! Nemotron 3 Nano is the world's most efficient open MoE with an Hybrid-MoE architecture and 1M context length.
🔥 Strong in reasoning, agentic and chat tasks with leading accuracy among AA index, Tau2, SWE Bench.
🔥 Up to 3.3X higher throughput comparing to other open MoE at similar sizes
🔥 A fully open recipe with data, infra released to the community
Checkout the new model architecture and reinforcement learning technologies we used below:
😊 Huggingface: https://huggingface.co/collections/nvidia/nvidia-nemotron-v3
📢 Research blog: nvda.ws/48RusVt
🛣️ Nemo RL & Nemo Gym (RL environment orchestration): github.com/NVIDIA-NeMo/RL & github.com/NVIDIA-NeMo/Gym
Kudos to the teams for months of hard work! We are excited to keep building the Nemotron 3 model family and empower the community.
Published Dec 15
Hugging Face (Twitter) RT @ArtificialAnlys: NVIDIA has just released Nemotron 3 Nano, a ~30B MoE model that scores 52 on the Artificial Analysis Intelligence Index with just ~3B active parameters
Hybrid Mamba-Transformer architecture: Nemotron 3 Nano combines the hybrid Mamba-Transformer approach @NVIDIAAI has used on previous Nemotron models with a moderate-sparsity MoE architecture, enabling highly efficient inference, particularly at longer sequence lengths
Small-model improvements: with 31.6B total and 3.6B active parameters, Nemotron 3 Nano scores 52 on our Intelligence Index, in line with OpenAI’s gpt-oss-20b (high). This represents a +6 point lead on the similarly-sized Qwen3 30B A3B 2507 and +15 improvement on NVIDIA’s previous Nemotron Nano 9B V2 (a dense model)
High openness: Nemotron 3 Nano follows other recent NVIDIA models in open licensing and releases of data and methodology for the community to use and replicate - it scores an 67 on the Artificial Analysis...
Published Dec 15
Hugging Face (Twitter) RT @testingcatalog: Google is preparing for a new open source release on @huggingface. Also noticed just recently that Gemma models are not available on AI Studio anymore. What do you expect? 👀 https://twitter.com/osanseviero/status/2000493503860892049#m
Published Dec 15
Hugging Face (Twitter) RT @_akhaliq: Apple just released Sharp
Sharp Monocular View Synthesis in Less Than a Second
huggingface.co/apple/Sharp
Published Dec 15
Hugging Face (Twitter) RT @gabriberton: The term VLM has two related but very different meanings and it's so confusing
1) CLIP-like VLMs: 2 encoders trained from scratch
2) Llava-like VLMs: a vision encoder attached to an LLM, both pretrained
Ugly image generated with nano banana of course
Published Dec 15
Hugging Face (Twitter) RT @mervenoyann: IBM dropped CUGA, open-source enterprise agent to automate boring tasks 🔥
> given workspace files, it writes and executes code to accomplish any task 🤯
> comes with a ton of tools built for enterprise tasks, supports MCPs
> plug in your favorite LLM 👏
here's a small demo where it retrieves info from a file, calculates revenue by writing code, and drafts an e-mail 🤯
they release code, a blog and a demo 🙌🏻 you can run this locally
Published Dec 15
Hugging Face (Twitter) RT @nvidianewsroom: NEWS: NVIDIA announces the NVIDIA Nemotron 3 family of open models, data, and libraries, offering a transparent and efficient foundation for building specialized agentic AI across industries. Nemotron 3 features a hybrid mixture-of-experts (MoE) architecture and new open Nemotron pretraining and post-training datasets, paired with NeMo Gym, an open-source reinforcement learning library that enables scalable, verifiable agent training. Read more: nvda.ws/4oNUTBm
Published Dec 15
Hugging Face (Twitter) RT @NVIDIAAIDev: ✨ Meet our new open family of models: @NVIDIA Nemotron 3
Open in weights, data, tools, and training, Nemotron 3 is built for multi-agent apps and features:
• An efficient hybrid Mamba‑Transformer MoE architecture
• 1M token context for long-term memory and improved reasoning
• Multi‑environment reinforcement learning via NeMo Gym for advanced skill adaptation
Plus NVFP4 pre-training, latent MoE, 1T tokens of data, and more.
📗 Read the details in our tech blog: nvda.ws/4565FMe
🤗 Try the model on @huggingface: nvda.ws/3Ytkx3z
Published Dec 15
Hugging Face (Twitter) RT @ctnzr: Today, @nvidia is launching the open Nemotron 3 model family, starting with Nano (30B-3A), which pushes the frontier of accuracy and inference efficiency with a novel hybrid SSM Mixture of Experts architecture. Super and Ultra are coming in the next few months.
Published Dec 15
Hugging Face (Twitter) RT @_philschmid: 👀👀👀