Natural Language Processing

SepLLM: Accelerate Large Language Models by Compressing One Segment into One Separator
·2998 words·15 mins
AI Generated πŸ€— Daily Papers Natural Language Processing Large Language Models 🏒 Huawei Noah's Ark Lab
SepLLM accelerates LLM inference and handles long sequences efficiently by leveraging the importance of special tokens.
RetroLLM: Empowering Large Language Models to Retrieve Fine-grained Evidence within Generation
·3747 words·18 mins
AI Generated πŸ€— Daily Papers Natural Language Processing Question Answering 🏒 Renmin University of China
RetroLLM: a RAG system that unifies retrieval and generation.
Smaller Language Models Are Better Instruction Evolvers
·4310 words·21 mins
AI Generated πŸ€— Daily Papers Natural Language Processing Large Language Models 🏒 Beijing University of Posts and Telecommunications
Smaller language models are better instruction generators!
Reliable, Reproducible, and Really Fast Leaderboards with Evalica
·1243 words·6 mins
AI Generated πŸ€— Daily Papers Natural Language Processing Large Language Models 🏒 JetBrains
Evalica: λ²€μΉ˜λ§ˆν‚Ήμ„ 쉽고 λΉ λ₯΄κ³  μ‹ λ’°ν•  수 있게 λ§Œλ“œλŠ” νˆ΄ν‚·
SCBench: A KV Cache-Centric Analysis of Long-Context Methods
·4642 words·22 mins
AI Generated πŸ€— Daily Papers Natural Language Processing Large Language Models 🏒 Microsoft Corporation
SCBench is a new benchmark for evaluating long-context methods in multi-turn and multi-request scenarios.
Large Action Models: From Inception to Implementation
·2067 words·10 mins
AI Generated πŸ€— Daily Papers Natural Language Processing Large Language Models 🏒 Microsoft
LLMμ—μ„œ LAM으둜: μ‹€μ œ μž‘μ—…μ„ μˆ˜ν–‰ν•˜λŠ” AI μ—μ΄μ „νŠΈ ꡬ좕.
Byte Latent Transformer: Patches Scale Better Than Tokens
·3839 words·19 mins
AI Generated πŸ€— Daily Papers Natural Language Processing Large Language Models 🏒 University of Washington
BLT: λ°”μ΄νŠΈ 기반 LLM, 토큰보닀 패치 μš°μ„ .
Phi-4 Technical Report
·2236 words·11 mins
AI Generated πŸ€— Daily Papers Natural Language Processing Large Language Models 🏒 Microsoft Research
Phi-4: a 14-billion-parameter language model developed with a training recipe focused on data quality, substantially improving reasoning ability.
GReaTer: Gradients over Reasoning Makes Smaller Language Models Strong Prompt Optimizers
·7101 words·34 mins
AI Generated πŸ€— Daily Papers Natural Language Processing Large Language Models 🏒 Pennsylvania State University
GReaTer leverages gradients over reasoning to optimize prompts for smaller language models, improving performance without relying on large LLMs.
SmolTulu: Higher Learning Rate to Batch Size Ratios Can Lead to Better Reasoning in SLMs
·2378 words·12 mins
AI Generated πŸ€— Daily Papers Natural Language Processing Large Language Models 🏒 Saudi Data & Artificial Intelligence Authority
Smaller language models reason better with fine-tuned training recipes.
NeuZip: Memory-Efficient Training and Inference with Dynamic Compression of Neural Networks
·2943 words·14 mins
AI Generated πŸ€— Daily Papers Natural Language Processing Large Language Models 🏒 University of Alberta
NeuZip dynamically compresses neural network weights, enabling memory-efficient training and inference without performance loss and significantly reducing the memory footprint of large language models.
Do LLMs Have Distinct and Consistent Personality? TRAIT: Personality Testset designed for LLMs with Psychometrics
·11182 words·53 mins
AI Generated Natural Language Processing Large Language Models 🏒 Yonsei University
LLM의 κ°œμ„±μ„ μ •λŸ‰μ μœΌλ‘œ ν‰κ°€ν•˜λŠ” μƒˆλ‘œμš΄ 벀치마크 TRAIT μ œμ‹œ: μ‹ λ’°μ„± 및 타당성 높은 8,000개의 질문으둜 ꡬ성, LLM κ°œμ„±μ˜ λ…νŠΉμ„±κ³Ό 일관성 규λͺ…, λͺ¨λΈ μ •λ ¬ κ³Όμ •μ˜ 영ν–₯ 뢄석 및 μ œν•œμ  μ œμ‹œ.