Natural Language Processing

ResearchTown: Simulator of Human Research Community
·16894 words·80 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 University of Illinois Urbana-Champaign
ResearchTown: an LLM-based simulator of a human research community that realistically mimics diverse research activities and can generate interdisciplinary research ideas.
In Case You Missed It: ARC 'Challenge' Is Not That Challenging
·2275 words·11 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Snowflake AI Research
Points out flaws in the standard multiple-choice evaluation setup and proposes evaluating with all answer options presented together, improving the accuracy of model performance assessment.
Friends-MMC: A Dataset for Multi-modal Multi-party Conversation Understanding
·1812 words·9 mins
AI Generated 🤗 Daily Papers Natural Language Processing Dialogue Systems 🏢 Peking University
Friends-MMC: a new multi-modal, multi-party conversation dataset with extensive video data and annotations, opening new possibilities for real-world dialogue understanding!
Fourier Position Embedding: Enhancing Attention's Periodic Extension for Length Generalization
·1717 words·9 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Tsinghua University
FoPE: achieves length generalization to long contexts by improving frequency-domain features!
DRT-o1: Optimized Deep Reasoning Translation via Long Chain-of-Thought
·366 words·2 mins
AI Generated 🤗 Daily Papers Natural Language Processing Machine Translation 🏢 Tencent AI Lab
The DRT-o1 model leverages long chain-of-thought reasoning to greatly improve the accuracy and fluency of literary translation.
Diving into Self-Evolving Training for Multimodal Reasoning
·2584 words·13 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Hong Kong University of Science and Technology
M-STAR: a new framework for self-evolving training for multimodal reasoning!
Deliberation in Latent Space via Differentiable Cache Augmentation
·2751 words·13 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Google DeepMind
Introduces 'differentiable cache augmentation', a new method for improving the reasoning performance of large language models!
B-STaR: Monitoring and Balancing Exploration and Exploitation in Self-Taught Reasoners
·1797 words·9 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Hong Kong University of Science and Technology
B-STaR: a new framework that monitors and balances exploration and exploitation in self-taught reasoners to improve performance.
Revisiting In-Context Learning with Long Context Language Models
·3818 words·18 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Google DeepMind
A surprising finding: with long-context language models, random sampling of in-context examples is more effective than sophisticated example-selection strategies, and data augmentation boosts low-resource task performance by 5%!
OpenRFT: Adapting Reasoning Foundation Model for Domain-specific Tasks with Reinforcement Fine-Tuning
·1880 words·9 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Beijing Jiaotong University
OpenRFT presents a new method for fine-tuning a general-purpose reasoning foundation model with limited domain-specific data.
NILE: Internal Consistency Alignment in Large Language Models
·2709 words·13 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Chinese University of Hong Kong
The NILE framework improves consistency between an LLM's internal knowledge and the world knowledge in IFT datasets, raising LLM performance by up to 68.5%.
Multi-LLM Text Summarization
·2623 words·13 mins
AI Generated 🤗 Daily Papers Natural Language Processing Text Summarization 🏢 UC Santa Cruz
An innovative long-document summarization framework that leverages multiple large language models (LLMs), improving summary quality by up to 3x!
Ensembling Large Language Models with Process Reward-Guided Tree Search for Better Complex Reasoning
·4085 words·20 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Microsoft Research
Proposes LE-MCTS, a new framework that ensembles large language models to solve complex reasoning problems more effectively!
TOMG-Bench: Evaluating LLMs on Text-based Open Molecule Generation
·3930 words·19 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Hong Kong Polytechnic University
TOMG-Bench: a benchmark for LLM-based open-domain molecule generation! It evaluates 25 LLMs and releases OpenMolIns, a new instruction-tuning dataset, pointing to stronger open-source LLMs and new possibilities for molecule discovery!
RobustFT: Robust Supervised Fine-tuning for Large Language Models under Noisy Response
·2295 words·11 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Peking University
RobustFT is a framework for robust supervised fine-tuning of large language models under noisy responses, improving downstream task performance through noise detection and relabeling.
ReMoE: Fully Differentiable Mixture-of-Experts with ReLU Routing
·4863 words·23 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Tsinghua University
ReMoE, a fully differentiable MoE architecture with ReLU routing, markedly improves the scalability and efficiency of large language models!
Outcome-Refining Process Supervision for Code Generation
·2498 words·12 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Peking University
Introduces Outcome-Refining Process Supervision (ORPS), a new methodology that overcomes existing limitations on code-generation tasks requiring complex algorithmic reasoning.
MixLLM: LLM Quantization with Global Mixed-precision between Output-features and Highly-efficient System Design
·2237 words·11 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Microsoft Research
MixLLM: a quantization method that improves LLM accuracy and efficiency at the same time through global mixed-precision quantization across output features and a highly efficient system design.
LLMs Lost in Translation: M-ALERT uncovers Cross-Linguistic Safety Gaps
·7524 words·36 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 TU Darmstadt
M-ALERT is a new benchmark for evaluating the safety of multilingual LLMs. It contains 75,000 prompts across five languages (English, French, German, Italian, and Spanish) and reveals safety inconsistencies across languages and categories.
How to Synthesize Text Data without Model Collapse?
·5005 words·24 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Tsinghua University
Tackles model collapse when training language models on synthetic data: a token-editing approach!