Recent posts
Published Oct 22
Hugging Face (Twitter) RT @allen_ai: We’re updating olmOCR, our model for turning PDFs & scans into clean text with support for tables, equations, handwriting, & more. olmOCR 2 uses synthetic data + unit tests as verifiable rewards to reach state-of-the-art performance on challenging documents. 🧵
Published Oct 22
Hugging Face (Twitter) RT @abidlabs: Help us get to 1000 stars! https://github.com/gradio-app/trackio
Published Oct 22
Hugging Face (Twitter) RT @tomaarsen: 🤗 Sentence Transformers is joining @huggingface! 🤗 This formalizes the existing maintenance structure, as I've personally led the project for the past two years on behalf of Hugging Face. I'm super excited about the transfer! Details in 🧵
Published Oct 21
Hugging Face (Twitter) RT @OdinLovis: 🚀 Update Next Scene V2 only 10 days after last version, now live on Hugging Face 👉https://huggingface.co/lovis93/next-scene-qwen-image-lora-2509 🎬 A LoRA made for Qwen Image Edit 2509 that lets you create seamless cinematic “next shots”, keeping the same characters, lighting, and mood. I trained this new version on thousands of paired cinematic shots to make scene transitions smoother, more emotional, and real. 🧠 What’s new: • Much stronger consistency across shots • Better lighting and character preservation • Smoother transitions and framing logic • No more black bar artifacts Built for storytellers using @ComfyUI or any diffusers pipeline. Just use “Next Scene:” and describe what happens next, the model keeps everything coherent. 🧩 Try it directly in ComfyUI, or check the thread to launch it on @fal ....
Published Oct 21
Hugging Face (Twitter) RT @reach_vb: Oh wow, Qwen dropped smol 2B VLM 🔥 https://huggingface.co/Qwen/Qwen3-VL-2B-Thinking
Published Oct 21
Hugging Face (Twitter) RT @UnslothAI: We just hit 100 million lifetime downloads on Hugging Face! 🦥🤗 Huge thanks to all of you! The amazing community, model creators, and HF team. 💖
Published Oct 21
Hugging Face (Twitter) RT @nathanhabib1011: come get your @huggingface merch at the @PyTorch conference 🤗
Published Oct 21
Hugging Face (Twitter) RT @mervenoyann: open-source OCR models are super cheap to run and privacy first 🤝 BUT there's a ton of new models out there: DeepSeek-OCR, Nanonets, PaddleOCR, how do you pick them? 🤯 don't worry though, @huggingface got you covered! 🫡🧶
Published Oct 21
Hugging Face (Twitter) RT @krea_ai: today we're open-sourcing Krea Realtime. this 14B autoregressive model is 10x larger than any open-source equivalent, and it can generate long-form videos at 11 fps on a single B200. weights and technical report below 👇
Published Oct 21
Hugging Face (Twitter) RT @eliebakouch: DeepSeek-OCR has some weird architectural choices for the LLM decoder: DeepSeek3B-MoE-A570M -> uses MHA, no MLA (not even GQA?) -> 2 shared experts (like DeepSeek V2, but V3 only has 1) -> quite low sparsity, activation ratio is 12.5%. For V3 it’s 3.52%, for V2 it’s 5% -> not very deep, 12 layers
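The activation ratios quoted in the post above can be reproduced with a short calculation. This is only a sketch: the routed and active expert counts below are assumptions not stated in the post (only the shared-expert counts are), and the formula, (active routed + shared) / routed, is a guess chosen because it yields the quoted 12.5%, 5% and 3.52%.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MoEConfig:
    """Expert counts for one mixture-of-experts model."""
    name: str
    routed_experts: int   # ASSUMPTION: not given in the post
    active_routed: int    # ASSUMPTION: routed experts selected per token
    shared_experts: int   # from the post: 2 for DeepSeek-OCR and V2, 1 for V3

    def activation_ratio(self) -> float:
        # ASSUMPTION: ratio = (active routed + shared) / routed experts.
        return (self.active_routed + self.shared_experts) / self.routed_experts


# Hypothetical configurations used for illustration only.
CONFIGS = [
    MoEConfig("DeepSeek-OCR decoder", routed_experts=64, active_routed=6, shared_experts=2),
    MoEConfig("DeepSeek V2", routed_experts=160, active_routed=6, shared_experts=2),
    MoEConfig("DeepSeek V3", routed_experts=256, active_routed=8, shared_experts=1),
]


def ratios_percent(configs):
    """Return {model name: activation ratio in percent, rounded to 2 places}."""
    return {c.name: round(100 * c.activation_ratio(), 2) for c in configs}


if __name__ == "__main__":
    for name, pct in ratios_percent(CONFIGS).items():
        print(f"{name}: {pct}%")
```

A higher ratio means a larger share of the experts runs for every token, which is what the post means by "quite low sparsity".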
Published Oct 21
Hugging Face (Twitter) RT @RayFernando1337: This is the JPEG moment for AI. Optical compression doesn't just make context cheaper. It makes AI memory architectures viable. Training data bottlenecks? Solved. - 200k pages/day on ONE GPU - 33M pages/day on 20 nodes - Every multimodal model is data-constrained. Not anymore. Agent memory problem? Solved. - The #1 blocker: agents forget - Progressive compression = natural forgetting curve - Agents can now run indefinitely without context collapse RAG might be obsolete. - Why chunk and retrieve if you can compress entire libraries into context? - A 10,000-page corpus = 10M text tokens OR 1M vision tokens - You just fit the whole thing in context Multimodal training data generation: 10x more efficient - If you're OpenAI/Anthropic/Google and you DON'T integrate this, you're 10x slower - This is a Pareto improvement: better AND faster Real-time AI becomes economically viable - Live document analysis - Streaming OCR for...
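The figures in the post above imply a few ratios that are easy to check. The sketch below takes the post's numbers at face value (they are the author's claims, not verified here), and the GPUs-per-node estimate further assumes throughput scales linearly with GPU count, which the post does not say.

```python
# Figures quoted in the post (claims, not measurements made here).
PAGES = 10_000
TEXT_TOKENS = 10_000_000
VISION_TOKENS = 1_000_000
PAGES_PER_DAY_ONE_GPU = 200_000
PAGES_PER_DAY_CLUSTER = 33_000_000
NODES = 20


def compression_ratio(text_tokens: int, vision_tokens: int) -> float:
    """Text tokens replaced by each vision token."""
    return text_tokens / vision_tokens


def tokens_per_page(total_tokens: int, pages: int) -> float:
    """Average token budget per page."""
    return total_tokens / pages


def implied_gpus_per_node(cluster_pages: int, nodes: int, single_gpu_pages: int) -> float:
    """GPUs per node implied by the throughput figures.

    ASSUMPTION: throughput scales linearly with the number of GPUs.
    """
    return cluster_pages / nodes / single_gpu_pages


if __name__ == "__main__":
    print("compression ratio:", compression_ratio(TEXT_TOKENS, VISION_TOKENS))
    print("text tokens/page:", tokens_per_page(TEXT_TOKENS, PAGES))
    print("vision tokens/page:", tokens_per_page(VISION_TOKENS, PAGES))
    print("implied GPUs/node:",
          implied_gpus_per_node(PAGES_PER_DAY_CLUSTER, NODES, PAGES_PER_DAY_ONE_GPU))
```

Under these numbers the claimed saving is 10x (1,000 text tokens versus 100 vision tokens per page), and the cluster figure is consistent with roughly eight GPUs per node.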
Published Oct 21
Hugging Face (Twitter) RT @mervenoyann: DeepSeek-OCR is out! 🔥 my take ⤵️ > pretty insane it can parse and re-render charts in HTML > it uses CLIP and SAM features concatenated, so better grounding > very efficient per vision tokens/performance ratio > covers 100 languages