Post content
Hugging Face (Twitter)

RT @Thom_Wolf: We’ve cooked another one of these 200+ pages practical books on model training that we love to write. This time it’s on all pretraining and post-training recipes and how to do a training project hyper parameter exploration.

Closing the trilogy of:
1. Building a pretraining dataset with the « FineWeb blog post »
2. Scaling infra GPU cluster with the « Ultrascale Playbook »
3. And now all the training recipes and HP exploration for pre- and post-training with this « Smol Training Playbook »

The HF science team on fire

https://twitter.com/eliebakouch/status/1983930328751153159#m