Recent posts
Published Sep 24
Hugging Face (Twitter) RT @ClementDelangue: Granite Docling by @IBM is #3 trending on @huggingface. This is a multimodal Image-Text-to-Text model engineered for efficient document conversion. It preserves the core features of Docling while maintaining seamless integration with DoclingDocuments to ensure full compatibility. It builds upon the IDEFICS3 architecture, but introduces two key modifications: it replaces the vision encoder with siglip2-base-patch16-512 and substitutes the language model with a Granite 165M LLM. Try out our Granite-Docling-258 demo today. License: Apache 2.0 Granite-docling-258M is fully integrated into the Docling pipelines, carrying over existing features while introducing a number of powerful new features, including: 🔢 Enhanced Equation Recognition: More accurate detection and formatting of mathematical formulas 🧩 Flexible Inference Modes: Choose between full-page inference, bbox-guided region inference 🧘 Improved Stability: Tends to avoid...
Published Sep 24
Hugging Face (Twitter) RT @ClementDelangue: Xet by Hugging Face is the most important AI technology that nobody is talking about! Under the hood, it now powers 5M Xet-enabled AI models & datasets on HF which see hundreds of terabytes of uploads and downloads every single day. What makes it super powerful is that it massively speeds up & reduces costs of data transfer thanks to methods like content-defined chunking (CDC). Instead of treating a file as an indivisible unit, CDC breaks files down into variable-sized chunks, using the data to define boundaries. That's what allows @huggingface to offer a platform for 10 million AI builders in open-source at a fraction of the cost. Thanks @xetdata team!
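The content-defined chunking idea in the post above can be sketched in a few lines of Python. This is only a minimal illustration of the general technique, not Xet's actual implementation: the function name `cdc_chunks`, the gear-style rolling hash, the seeded `GEAR` table, and the size parameters are all assumptions chosen for the example.

```python
import random

# Hypothetical lookup table: one pseudo-random 64-bit value per byte value.
# The seed is fixed so that chunk boundaries are reproducible across runs.
_rng = random.Random(0)
GEAR = [_rng.getrandbits(64) for _ in range(256)]
MASK64 = (1 << 64) - 1


def cdc_chunks(data, avg_bits=10, min_size=64, max_size=4096):
    """Split `data` into variable-sized chunks at content-defined boundaries.

    A boundary is declared where the top `avg_bits` bits of a rolling hash
    are all zero, so cut points depend on the bytes themselves rather than
    on their absolute offset in the file.
    """
    boundary_mask = ((1 << avg_bits) - 1) << (64 - avg_bits)
    chunks = []
    start = 0
    h = 0
    for i, byte in enumerate(data):
        # Gear-style rolling hash: each shift ages out older bytes, so the
        # value reflects only the most recent 64 bytes.
        h = ((h << 1) + GEAR[byte]) & MASK64
        size = i - start + 1
        if size < min_size:
            continue  # keep hashing, but never cut a chunk this small
        if (h & boundary_mask) == 0 or size >= max_size:
            chunks.append(data[start:i + 1])
            start = i + 1
            h = 0
    if start < len(data):
        chunks.append(data[start:])  # trailing remainder
    return chunks
```

Because boundaries follow the content, inserting a few bytes near the start of a file typically changes only the first chunk or so; the later chunks come out byte-identical and could be deduplicated by hash instead of being transferred again. With fixed-size blocks, the same insertion would shift every subsequent block.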
Published Sep 24
Hugging Face (Twitter) RT @abidlabs: I'm interested in hiring a python engineer who knows @Gradio well & likes experimenting with many different projects simultaneously and growing the ones that are the most impactful. DM if you'd like to work with me @huggingface, and share your most impressive Gradio app.
Published Sep 22
Hugging Face (Twitter) RT @mervenoyann: this summer we have shipped a ton of things in TRL! 🔥🏖️👒 try out bleeding-edge fine-tuning methods with few lines of CLI commands and check out notebooks to get started 🤠
Published Sep 22
Hugging Face (Twitter) RT @Baidu_Inc: Qianfan-VL, Baidu AI Cloud's vision-language model series, is now open source! Designed for enterprise-level applications, these multimodal models combine robust general capabilities with advanced performance in OCR and math problem-solving. Key features: > Three model sizes (3B, 8B, 70B) with 32K context length for diverse needs > Chain-of-thought reasoning in 8B/70B for strong performance in chart understanding, math, and visual logic > Four-stage progressive training pipeline for improved cross-modal alignment and domain enhancement > High-precision data synthesis pipeline across documents, math, charts, tables, formulas, and OCR tasks Discover more about Qianfan-VL ↓
Published Sep 22
Hugging Face (Twitter) RT @LeRobotHF: LeRobot SO101 setup just got 50% cheaper! You can now teleoperate your follower arm right from your phone. 🤯 But that's not all. Our new pipeline feature lets you record and train AI models in end-effector space, or with any other features. The possibilities are endless!
Published Sep 22
Hugging Face (Twitter) RT @ClementDelangue: New version of Reachy Mini close to 360 view!
Published Sep 22
Hugging Face (Twitter) RT @XiaomiMiMo: 👋 Say Hi to MiMo-Audio! Our BREAKTHROUGH in general-purpose audio intelligence. 🎯 Scaling pretraining to 100M+ hours leads to EMERGENCE of few-shot generalization across diverse audio tasks! 🔥 Post-trained MiMo-Audio-7B-Instruct: • crushes benchmarks: SOTA on MMSU, MMAU, MMAR, MMAU-Pro • outperforms Gemini-2.5-Flash on audio understanding • beats GPT-4o-Audio on complex reasoning tasks 💎 The best part? It's 100% OPEN-SOURCE Everything from tokenizer to model to evaluations! 🤗 Try it in HF Space: https://huggingface.co/spaces/XiaomiMiMo/mimo_audio_chat 📝 Tech Blog: https://xiaomimimo.github.io/MiMo-Audio-Demo/
Published Sep 22
Hugging Face (Twitter) RT @adibvafa: CodonTransformer, our open-source model on @huggingface that optimizes genes for protein expression, has passed 250,000 downloads!
Published Sep 22
Hugging Face (Twitter) RT @AdinaYakup: MiMo-Audio 🔊 Open audio model released by @Xiaomi https://huggingface.co/collections/XiaomiMiMo/mimo-audio-68cc7202692c27dae881cce0 ✨ 7B base & instruct - MIT license ✨ Pretrained on 100M+ hours ✨ Few-shot across speech & audio tasks
Published Sep 22
Hugging Face (Twitter) RT @_akhaliq: moondream3-preview is out on Hugging Face: a vision language model with a mixture-of-experts architecture (9B total parameters, 2B active), delivering SOTA visual reasoning while still being efficient and deployment-friendly. Vibe coded a quick app for it in anycoder.
Published Sep 19
Hugging Face (Twitter) RT @abidlabs: BOOM! A new, free experiment tracking library with syntax identical to wandb, which makes it a trivial drop-in replacement