Large Language Models

Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference
·2449 words·12 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Answer.AI
ModernBERT: a state-of-the-art bidirectional encoder for fast, memory-efficient, long-context fine-tuning and inference!
RAG-RewardBench: Benchmarking Reward Models in Retrieval Augmented Generation for Preference Alignment
·2978 words·14 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 University of Chinese Academy of Sciences
RAG-RewardBench: the first benchmark for evaluating reward models in RAG settings!
AntiLeak-Bench: Preventing Data Contamination by Automatically Constructing Benchmarks with Updated Real-World Knowledge
·3149 words·15 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Nanyang Technological University
AntiLeak-Bench: preventing LLM data contamination through automated benchmark construction.
DateLogicQA: Benchmarking Temporal Biases in Large Language Models
·2927 words·14 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 University of Aberdeen
DateLogicQA: a benchmark for temporal reasoning biases in LLMs! Analyzes tokenization-, representation-, and logic-level biases to suggest ways to improve temporal data handling.
Whisper-GPT: A Hybrid Representation Audio Large Language Model
·1322 words·7 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Stanford University
Whisper-GPT: a hybrid speech and music LLM that combines continuous audio with discrete tokens for improved performance.
The Open Source Advantage in Large Language Models (LLMs)
·248 words·2 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Rollins College
Open-source LLMs offer greater transparency and accessibility than closed-source LLMs but trail them in performance; hybrid strategies are the future.
SPaR: Self-Play with Tree-Search Refinement to Improve Instruction-Following in Large Language Models
·3260 words·16 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Tsinghua University
Self-play with refinement boosts instruction-following in LLMs.
SepLLM: Accelerate Large Language Models by Compressing One Segment into One Separator
·2998 words·15 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Huawei Noah's Ark Lab
SepLLM exploits the importance of separator tokens to accelerate LLM inference and process long sequences efficiently.
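The mechanism can be illustrated with a short sketch. Below is a hypothetical simplification of separator-based KV retention in plain Python; the separator set, window sizes, and function name are illustrative assumptions, not the paper's implementation.

```python
# Hypothetical sketch of SepLLM-style cache retention: keep initial tokens
# (attention sinks), separator tokens (which stand in for their segment),
# and a recent local window; evict everything else from the KV cache.
SEPARATORS = {".", ",", ";", "!", "?", "\n"}

def kept_positions(tokens, n_initial=4, n_recent=64):
    """Return the sorted indices of tokens whose KV entries are retained."""
    keep = set(range(min(n_initial, len(tokens))))                   # initial tokens
    keep |= {i for i, t in enumerate(tokens) if t in SEPARATORS}     # separators
    keep |= set(range(max(0, len(tokens) - n_recent), len(tokens)))  # recent window
    return sorted(keep)

tokens = "The cat sat . The dog ran .".split() + ["\n"]
print(kept_positions(tokens, n_initial=2, n_recent=3))  # [0, 1, 3, 6, 7, 8]
```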
Smaller Language Models Are Better Instruction Evolvers
·4310 words·21 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Beijing University of Posts and Telecommunications
Smaller language models make better instruction evolvers!
Reliable, Reproducible, and Really Fast Leaderboards with Evalica
·1243 words·6 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 JetBrains
Evalica: a toolkit that makes benchmarking easy, fast, and reliable.
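To give a flavor of what such a toolkit computes, here is a plain-Python sketch of Elo aggregation over pairwise model judgments. This is not Evalica's API; the function and argument names are illustrative only.

```python
# Illustrative Elo aggregation over pairwise judgments; not Evalica's API.
def elo(pairs, k=4.0, base=1500.0):
    """pairs: iterable of (x, y, winner) with winner in {"x", "y"}."""
    scores = {}
    for x, y, winner in pairs:
        rx = scores.setdefault(x, base)
        ry = scores.setdefault(y, base)
        expected_x = 1.0 / (1.0 + 10 ** ((ry - rx) / 400.0))  # logistic expectation
        actual_x = 1.0 if winner == "x" else 0.0
        scores[x] += k * (actual_x - expected_x)
        scores[y] -= k * (actual_x - expected_x)
    return scores

print(elo([("model-a", "model-b", "x"), ("model-b", "model-a", "x")]))
```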
SCBench: A KV Cache-Centric Analysis of Long-Context Methods
·4642 words·22 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Microsoft Corporation
SCBench is a new benchmark that evaluates long-context methods in multi-turn and multi-request scenarios.
Large Action Models: From Inception to Implementation
·2067 words·10 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Microsoft
From LLMs to LAMs: building AI agents that perform real-world tasks.
Byte Latent Transformer: Patches Scale Better Than Tokens
·3839 words·19 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 University of Washington
BLT: a byte-level LLM that favors patches over tokens.
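The patching idea admits a toy illustration. The sketch below substitutes a unigram byte-surprisal heuristic for BLT's learned next-byte entropy model; the function name and threshold are assumptions for illustration only.

```python
import math

# Toy entropy-based byte patching: open a new patch wherever the byte is
# "surprising" under a unigram model (BLT uses a learned entropy model).
def patch_boundaries(data: bytes, threshold: float = 5.0) -> list[int]:
    total = len(data)
    counts = [data.count(b) for b in range(256)]
    return [
        i for i, b in enumerate(data)
        if -math.log2(counts[b] / total) > threshold
    ]

text = b"the quick brown fox jumps over the lazy dog"
print(patch_boundaries(text))  # boundaries fall on rare bytes
```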
Phi-4 Technical Report
·2236 words·11 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Microsoft Research
Phi-4: a 14-billion-parameter language model built on a training recipe centered on data quality, yielding substantially improved reasoning ability.
GReaTer: Gradients over Reasoning Makes Smaller Language Models Strong Prompt Optimizers
·7101 words·34 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Pennsylvania State University
GReaTer optimizes prompts for smaller language models using gradients over reasoning, improving performance without relying on large LLMs.
SmolTulu: Higher Learning Rate to Batch Size Ratios Can Lead to Better Reasoning in SLMs
·2378 words·12 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Saudi Data & Artificial Intelligence Authority
Smaller language models reason better with carefully tuned training recipes.
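As a quick illustration of the quantity in the title, the ratio is simply the learning rate divided by the effective batch size. The numbers below are hypothetical, not the paper's hyperparameters.

```python
# Hypothetical numbers illustrating the learning-rate-to-batch-size ratio.
lr = 3e-5
for batch_size in (8, 32, 128):
    print(f"lr={lr:.0e}  batch={batch_size:<3}  ratio={lr / batch_size:.2e}")
# Growing the batch at a fixed learning rate shrinks the ratio; the title's
# claim is that reasoning benefits from comparatively higher ratios.
```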
NeuZip: Memory-Efficient Training and Inference with Dynamic Compression of Neural Networks
·2943 words·14 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 University of Alberta
NeuZip dynamically compresses neural network weights, enabling memory-efficient training and inference without performance loss and significantly reducing the memory footprint of large language models.
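The mechanism invites a small demonstration. The sketch below is a hedged illustration of the core observation, assuming float16 weights and zlib as the lossless coder (not the paper's implementation): the exponent bits of trained weights are low-entropy, so they compress well while every value round-trips exactly.

```python
import numpy as np, zlib

# Toy illustration: weights concentrated near zero have low-entropy exponent
# bits, which a generic lossless coder compresses well.
w = (np.random.randn(1 << 16) * 0.02).astype(np.float16)
bits = w.view(np.uint16)
exponents = ((bits >> 10) & 0x1F).astype(np.uint8)  # 5 exponent bits per float16
packed = zlib.compress(exponents.tobytes())
print(f"compressed exponents to {len(packed) / exponents.nbytes:.2f}x of raw size")
```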
Do LLMs Have Distinct and Consistent Personality? TRAIT: Personality Testset designed for LLMs with Psychometrics
·11182 words·53 mins
AI Generated Natural Language Processing Large Language Models 🏢 Yonsei University
TRAIT: a new benchmark for quantitatively assessing LLM personality. Built from 8,000 questions with high reliability and validity, it reveals the distinctness and consistency of LLM personalities and analyzes the effects and limitations of model alignment.