Natural Language Processing
SepLLM: Accelerate Large Language Models by Compressing One Segment into One Separator
·2998 words·15 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 Huawei Noah's Ark Lab
SepLLM accelerates LLM inference and handles long sequences efficiently by exploiting the importance of separator tokens.
RetroLLM: Empowering Large Language Models to Retrieve Fine-grained Evidence within Generation
·3747 words·18 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Question Answering
🏢 Renmin University of China
RetroLLM: a RAG system that unifies retrieval and generation.
Smaller Language Models Are Better Instruction Evolvers
·4310 words·21 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 Beijing University of Posts and Telecommunications
Smaller language models make better instruction generators!
Reliable, Reproducible, and Really Fast Leaderboards with Evalica
·1243 words·6 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 JetBrains
Evalica: a toolkit that makes benchmarking easy, fast, and reliable.
SCBench: A KV Cache-Centric Analysis of Long-Context Methods
·4642 words·22 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 Microsoft Corporation
SCBench is a new benchmark that evaluates long-context methods in multi-turn and multi-request scenarios.
Large Action Models: From Inception to Implementation
·2067 words·10 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 Microsoft
From LLMs to LAMs: building AI agents that carry out real-world tasks.
Byte Latent Transformer: Patches Scale Better Than Tokens
·3839 words·19 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 University of Washington
BLT: a byte-level LLM where patches scale better than tokens.
Phi-4 Technical Report
·2236 words·11 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 Microsoft Research
Phi-4: a 14-billion-parameter language model built with a training recipe centered on data quality, which substantially improves its reasoning ability.
GReaTer: Gradients over Reasoning Makes Smaller Language Models Strong Prompt Optimizers
·7101 words·34 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 Pennsylvania State University
GReaTer uses gradients over reasoning to optimize prompts for smaller language models, improving their performance without relying on large LLMs.
SmolTulu: Higher Learning Rate to Batch Size Ratios Can Lead to Better Reasoning in SLMs
·2378 words·12 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 Saudi Data & Artificial Intelligence Authority
Smaller language models reason better with fine-tuned training recipes.
NeuZip: Memory-Efficient Training and Inference with Dynamic Compression of Neural Networks
·2943 words·14 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 University of Alberta
NeuZip dynamically compresses neural network weights, achieving memory-efficient training and inference without performance loss, significantly reducing the memory footprint of large language models.
Do LLMs Have Distinct and Consistent Personality? TRAIT: Personality Testset designed for LLMs with Psychometrics
·11182 words·53 mins·
AI Generated
Natural Language Processing
Large Language Models
🏢 Yonsei University
Presents TRAIT, a new benchmark for quantitatively evaluating LLM personality: 8,000 questions with high reliability and validity, revealing the distinctiveness and consistency of LLM personalities, analyzing the effect of model alignment, and noting limitations.