Recent posts
Published Nov 28
Hugging Face (Twitter) RT @jenzhuscott: The whale 🐋 is back! DeepSeek-Math-V2: 685B-parameter math monster built on V3.2-Exp-Base, fully open under Apache 2.0 - 1st model to use a generator-verifier loop in training: writes proofs → verifier scores them → RL closes the loop for self-verifiable reasoning. - Focuses on verifiable full proofs, not just final answers - huge leap for formal theorem proving. - Trained with automatic high-compute verification runs to create its own high-quality proof data at scale Enjoy 😊👇 https://huggingface.co/deepseek-ai/DeepSeek-Math-V2
Published Nov 27
Hugging Face (Twitter) RT @ClementDelangue: As far as I know, there isn't any chatbot or API that gives you access to an IMO 2025 gold-medalist model. Not only does this change today, but you get to download the weights with the Apache 2.0 open-source release of @deepseek_ai Math-V2 on @huggingface! Imagine owning the brain of one of the best mathematicians in the world for free to: - explore it for research - fine-tune it - optimize it - run it on your own hardware No limitations, no nerfing, no company or government to take it back. That's democratization of AI and knowledge at its best, literally 🤯🤯🤯 You can download the weights here: https://huggingface.co/deepseek-ai/DeepSeek-Math-V2. The frontier of AI is open-source!
Published Nov 27
Hugging Face (Twitter) RT @lhoestq: Lance support from @huggingface datasets just got merged :O Congrats @OnlyXuanwo !
Published Nov 27
Hugging Face (Twitter) RT @gm8xx8: DeepSeek-Math-V2 MODEL: https://huggingface.co/deepseek-ai/DeepSeek-Math-V2 PAPER: https://github.com/deepseek-ai/DeepSeek-Math-V2/blob/main/DeepSeekMath_V2.pdf
Published Nov 27
Hugging Face (Twitter) RT @eliebakouch: deepseek math v2 is the first open source model to reach gold on IMO? and we get a tech report, what an amazing release
Published Nov 27
Hugging Face (Twitter) RT @xeophon_: New whale 👀
Published Nov 27
Hugging Face (Twitter) RT @AdinaYakup: Z-Image 🔥 new image generation model from @Ali_TongyiLab Model: https://huggingface.co/Tongyi-MAI/Z-Image-Turbo Demo: https://huggingface.co/spaces/Tongyi-MAI/Z-Image-Turbo ✨ 6B - Apache 2.0 ✨ 8 step, sub-second generation on H800; runs on 16GB GPUs ✨ Photorealistic quality (the film poster demo is amazing🤯) ✨ English & Chinese support ✨ Turbo / Base / Edit variants
Published Nov 27
Hugging Face (Twitter) RT @vincentweisser: We are releasing INTELLECT-3: Scaling async RL to 100B+ MoE on our end to end stack SOTA for its size in math, code, reasoning, science - outpacing bigger models Fully open source: weights, data, frameworks, envs, evals Trained on the same stack we're opening up to everyone https://twitter.com/PrimeIntellect/status/1993895068290388134#m
Published Nov 27
Hugging Face (Twitter) RT @PrimeIntellect: Introducing INTELLECT-3: Scaling RL to a 100B+ MoE model on our end-to-end stack Achieving state-of-the-art performance for its size across math, code and reasoning Built using the same tools we put in your hands, from environments & evals, RL frameworks, sandboxes & more
Published Nov 27
Hugging Face (Twitter) RT @Thom_Wolf: Very excited to finally release this new long form open-science post from the team: our first work on mechanistic interpretability. @dlouapre spent several months working on a fully open-source and shareable reproduction of the « Golden Gate Claude » experiments. It was quite a journey, way less straightforward than it seemed from the start. Hence a great occasion to explore practical mechanistic interpretability challenges :) Here are the main findings of this reproduction (full interactive blog post and demo online): - The steering ‘sweet spot’ is small. The optimal steering strength is of the order of half the magnitude of a layer’s typical activation. This is consistent with the idea that steering vectors should not overwhelm the model’s natural activations. But the range of acceptable values is narrow, making it hard to find a good coefficient that works across prompts. - Clamping is more effective than adding. We found that...
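Editor's sketch: the "half the magnitude of a layer's typical activation" rule of thumb from the post above can be illustrated with a toy additive-steering example. This is not the team's code. The function names, the use of the mean L2 norm as the "typical" magnitude, and the random data are all assumptions made for illustration, and only the additive variant is shown because the post's description of clamping is cut off.

```python
import numpy as np


def typical_norm(activations: np.ndarray) -> float:
    """Mean L2 norm over per-token activation vectors, shape (tokens, hidden).

    Assumption: "typical activation magnitude" is taken to be this mean norm.
    """
    return float(np.linalg.norm(activations, axis=-1).mean())


def steer_by_adding(activations: np.ndarray,
                    direction: np.ndarray,
                    alpha: float = 0.5) -> np.ndarray:
    """Add a steering vector scaled to alpha times the typical activation norm.

    alpha=0.5 mirrors the "about half the magnitude" figure quoted in the post.
    """
    unit = direction / np.linalg.norm(direction)  # unit-length steering direction
    return activations + alpha * typical_norm(activations) * unit


# Hypothetical stand-ins for one layer's activations and a feature direction.
rng = np.random.default_rng(0)
acts = rng.normal(size=(4, 8))
direction = rng.normal(size=8)

steered = steer_by_adding(acts, direction)
shift = np.linalg.norm(steered - acts, axis=-1)  # per-token size of the nudge
```

Each token's activation is shifted by the same amount, half of the layer's mean activation norm, so the steering vector nudges the activations rather than overwhelming them. In a real model the same arithmetic would run inside a forward hook on the chosen layer.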
Published Nov 27
Hugging Face (Twitter) RT @reach_vb: The whale is BACK!!! 👀👀👀 https://huggingface.co/deepseek-ai/DeepSeek-Math-V2
Published Nov 27
Hugging Face (Twitter) RT @LeRobotHF: 🚀 We just shipped a big upgrade to our imitation-learning-in-simulation playground in LeRobot, built together with @LightwheelAI ! You can now teleoperate robots in sim (keyboard or real robot) and collect training demos instantly. This makes it possible to run real IL research on harder, more realistic manipulation tasks, even if you don’t have hardware. New tasks: 🟠 pick orange to the plate 🧺 fold cloth (yes!) 📦 pick 2 “e” toys to the box 🔴 lift red cube Through our partnership with Lightwheel, LeIsaac was integrated into EnvHub on day one, a best-practice integration that strengthens both ecosystems and pushes simulation-first robotics forward. 👉 Load a task via EnvHub, start teleop-ing, record your demo, upload the data on the hub and you’re ready to train or test.