Simply simplifying the dataset did not lead to any straightforward performance gains, so it's time to start with a serious read-through of the papers.
| タイトル | 著者 | リンク |
|---|---|---|
| Not All Tokens Are What You Need for Pretraining | Zhenghao Lin, et al. | https://proceedings.neurips.cc/paper_files/paper/2024/file/3322a9a72a1707de14badd5e552ff466-Paper-Conference.pdf |
| Findings of the Third BabyLM Challenge: Accelerating Language Modeling Research with Cognitively Plausible Data | Lucas Charpentier, et al. | https://aclanthology.org/2025.babylm-main.28.pdf |
| CLASS-IT: Conversational and Lecture-Aligned Small-Scale Instruction Tuning for BabyLMs | Luca Capone, et al. | https://aclanthology.org/2025.babylm-main.30.pdf |