🏢 Saudi Data & Artificial Intelligence Authority

SmolTulu: Higher Learning Rate to Batch Size Ratios Can Lead to Better Reasoning in SLMs
·2378 words·12 mins
AI Generated · 🤗 Daily Papers · Natural Language Processing · Large Language Models
Small language models can reason better when trained with a higher ratio of learning rate to batch size.