Hugging Face (Twitter) RT @johnschulman2: Really happy to see people reproducing the result that LoRA rank=1 closely matches full fine-tuning on many RL fine-tuning problems. Here are a couple nice ones:
https://twitter.com/ben_burtenshaw/status/1974191312229577085
https://twitter.com/zzlccc/status/1973612326747336767