Large Language Models
Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference
·2449 words·12 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 Answer.AI
ModernBERT: a state-of-the-art bidirectional encoder for fast, memory-efficient, long-context fine-tuning and inference!
RAG-RewardBench: Benchmarking Reward Models in Retrieval Augmented Generation for Preference Alignment
·2978 words·14 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 University of Chinese Academy of Sciences
RAG-RewardBench: the first benchmark for evaluating reward models in RAG settings!
AntiLeak-Bench: Preventing Data Contamination by Automatically Constructing Benchmarks with Updated Real-World Knowledge
·3149 words·15 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 Nanyang Technological University
AntiLeak-Bench: preventing LLM data contamination through automated benchmarking.
DateLogicQA: Benchmarking Temporal Biases in Large Language Models
·2927 words·14 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 University of Aberdeen
DateLogicQA: a benchmark for temporal reasoning biases in LLMs, analyzing tokenization-, representation-, and logic-level biases to propose improvements in temporal data handling!
Whisper-GPT: A Hybrid Representation Audio Large Language Model
·1322 words·7 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 Stanford University
Whisper-GPT: a hybrid speech and music LLM that combines continuous audio with discrete tokens for improved performance.
The Open Source Advantage in Large Language Models (LLMs)
·248 words·2 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 Rollins College
Open-source LLMs offer greater transparency and accessibility than closed-source LLMs but lag in performance; hybrid strategies are the future.
SPaR: Self-Play with Tree-Search Refinement to Improve Instruction-Following in Large Language Models
·3260 words·16 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 Tsinghua University
Self-play with refinement boosts instruction-following in LLMs.
SepLLM: Accelerate Large Language Models by Compressing One Segment into One Separator
·2998 words·15 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 Huawei Noah's Ark Lab
SepLLM accelerates LLM inference by exploiting the importance of separator tokens, processing long sequences efficiently.
Smaller Language Models Are Better Instruction Evolvers
·4310 words·21 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 Beijing University of Posts and Telecommunications
Smaller language models are better instruction evolvers!
Reliable, Reproducible, and Really Fast Leaderboards with Evalica
·1243 words·6 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 JetBrains
Evalica: a toolkit that makes benchmarking easy, fast, and reliable.
SCBench: A KV Cache-Centric Analysis of Long-Context Methods
·4642 words·22 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 Microsoft Corporation
SCBench is a new benchmark that evaluates long-context methods in multi-turn and multi-request scenarios.
Large Action Models: From Inception to Implementation
·2067 words·10 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 Microsoft
From LLMs to LAMs: building AI agents that perform real-world tasks.
Byte Latent Transformer: Patches Scale Better Than Tokens
·3839 words·19 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 University of Washington
BLT: a byte-level LLM where patches scale better than tokens.
Phi-4 Technical Report
·2236 words·11 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 Microsoft Research
Phi-4: a 14-billion-parameter language model built with a training recipe centered on data quality, substantially improving reasoning ability.
GReaTer: Gradients over Reasoning Makes Smaller Language Models Strong Prompt Optimizers
·7101 words·34 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 Pennsylvania State University
GReaTer leverages gradients over reasoning to optimize prompts for small language models, improving performance without large LLMs.
SmolTulu: Higher Learning Rate to Batch Size Ratios Can Lead to Better Reasoning in SLMs
·2378 words·12 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 Saudi Data & Artificial Intelligence Authority
Smaller language models reason better with fine-tuned training recipes.
NeuZip: Memory-Efficient Training and Inference with Dynamic Compression of Neural Networks
·2943 words·14 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 University of Alberta
NeuZip dynamically compresses neural network weights, achieving memory-efficient training and inference without performance loss, significantly reducing the memory footprint of large language models.
Do LLMs Have Distinct and Consistent Personality? TRAIT: Personality Testset designed for LLMs with Psychometrics
·11182 words·53 mins·
AI Generated
Natural Language Processing
Large Language Models
🏢 Yonsei University
TRAIT: a new benchmark for quantitatively evaluating LLM personality, built from 8,000 questions with high reliability and validity; it reveals the distinctiveness and consistency of LLM personalities, analyzes the impact of alignment tuning, and discusses limitations.