
Large Language Models

BoostStep: Boosting mathematical capability of Large Language Models via improved single-step reasoning
·2104 words·10 mins
AI Generated πŸ€— Daily Papers Natural Language Processing Large Language Models 🏒 Shanghai AI Laboratory
BoostStep: boosting the mathematical capability of LLMs through step-level reasoning!
ToolHop: A Query-Driven Benchmark for Evaluating Large Language Models in Multi-Hop Tool Use
·3178 words·15 mins
AI Generated πŸ€— Daily Papers Natural Language Processing Large Language Models 🏒 ByteDance
ToolHop: a new benchmark that rigorously evaluates the multi-hop tool-use ability of large language models
Test-time Computing: from System-1 Thinking to System-2 Thinking
·699 words·4 mins
AI Generated πŸ€— Daily Papers Natural Language Processing Large Language Models 🏒 Soochow University
A survey of how test-time computing can elevate large language models' reasoning from System-1 to System-2 thinking!
Scaling Laws for Floating Point Quantization Training
·5642 words·27 mins
AI Generated πŸ€— Daily Papers Natural Language Processing Large Language Models 🏒 Tencent AI Lab
New scaling laws for floating-point quantization training: quantifying how exponent bits, mantissa bits, and the precision of scaling-factor computation affect LLM performance
Personalized Graph-Based Retrieval for Large Language Models
·3060 words·15 mins
AI Generated πŸ€— Daily Papers Natural Language Processing Large Language Models 🏒 UC Santa Cruz
The personalized graph-based retrieval-augmented generation (PGraphRAG) framework tackles the sparse-data problem and substantially improves LLM personalization.
METAGENE-1: Metagenomic Foundation Model for Pandemic Monitoring
·2684 words·13 mins
AI Generated πŸ€— Daily Papers Natural Language Processing Large Language Models 🏒 University of Southern California
METAGENE-1, a 7-billion-parameter metagenomic foundation model trained on wastewater data, achieves state-of-the-art performance on pathogen detection and genomic sequence embedding tasks.
Auto-RT: Automatic Jailbreak Strategy Exploration for Red-Teaming Large Language Models
·3175 words·15 mins
AI Generated πŸ€— Daily Papers Natural Language Processing Large Language Models 🏒 Ant Group
AUTO-RT: efficiently uncovering LLM vulnerabilities through automated jailbreak-strategy exploration!
Dynamic Scaling of Unit Tests for Code Reward Modeling
·2368 words·12 mins
AI Generated πŸ€— Daily Papers Natural Language Processing Large Language Models 🏒 Tsinghua University
A study showing that scaling up the number of unit tests improves the accuracy of code reward models!
CodeElo: Benchmarking Competition-level Code Generation of LLMs with Human-comparable Elo Ratings
·1888 words·9 mins
AI Generated πŸ€— Daily Papers Natural Language Processing Large Language Models 🏒 Alibaba Group
The CODEELO benchmark: evaluating LLMs' competition-level code generation with human-comparable Elo ratings
BoxingGym: Benchmarking Progress in Automated Experimental Design and Model Discovery
·3521 words·17 mins
AI Generated πŸ€— Daily Papers Natural Language Processing Large Language Models 🏒 Stanford University
BoxingGym: a comprehensive benchmark for evaluating LLM-based scientific agents' capabilities in experimental design and model discovery
Rethinking Addressing in Language Models via Contextualized Equivariant Positional Encoding
·3211 words·16 mins
AI Generated πŸ€— Daily Papers Natural Language Processing Large Language Models 🏒 University of Texas at Austin
The TAPE (conTextualized equivAriant Position Embedding) framework improves transformers' position-based addressing through dynamic, context-aware positional encoding.
Understanding and Mitigating Bottlenecks of State Space Models through the Lens of Recency and Over-smoothing
·2638 words·13 mins
AI Generated πŸ€— Daily Papers Natural Language Processing Large Language Models 🏒 University of Texas at Austin
Overcoming the limits of structured state space models (SSMs) in modeling long-range dependencies in deep networks: this work identifies SSMs' recency bias and over-smoothing problems and proposes a polarization technique that improves long-range token-correlation accuracy.
HUNYUANPROVER: A Scalable Data Synthesis Framework and Guided Tree Search for Automated Theorem Proving
·1341 words·7 mins
AI Generated πŸ€— Daily Papers Natural Language Processing Large Language Models 🏒 Tencent AI Lab
HunyuanProver: achieving state-of-the-art automated theorem proving through a scalable, LLM-based data synthesis framework and guided tree search!
HumanEval Pro and MBPP Pro: Evaluating Large Language Models on Self-invoking Code Generation
·3353 words·16 mins
AI Generated πŸ€— Daily Papers Natural Language Processing Large Language Models 🏒 Tsinghua University
Introducing HumanEval Pro, MBPP Pro, and BigCodeBench-Lite Pro: new benchmarks for evaluating LLMs' progressive reasoning and problem-solving abilities!
Facilitating large language model Russian adaptation with Learned Embedding Propagation
·1947 words·10 mins
AI Generated πŸ€— Daily Papers Natural Language Processing Large Language Models 🏒 Lomonosov Moscow State University
LEP (Learned Embedding Propagation) is a new technique that efficiently adapts multilingual large language models using only a small amount of training data.
Efficiently Serving LLM Reasoning Programs with Certaindex
·3238 words·16 mins
AI Generated πŸ€— Daily Papers Natural Language Processing Large Language Models 🏒 UC San Diego
Dynasor is a system that optimizes resource usage for LLM reasoning programs: using a new metric called certaindex, it allocates more compute to hard queries and less to easy ones, and terminates unpromising queries early, balancing accuracy, latency, and cost.
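The allocation policy described in the summary can be sketched as a simple control loop. This is a hypothetical illustration of certainty-guided compute allocation; the function names, thresholds, and the certainty proxy are assumptions for exposition, not Dynasor's actual design.

```python
# Illustrative sketch: run reasoning steps until the answer looks certain,
# the query looks hopeless, or the compute budget runs out.
def solve_with_budget(step_fn, certainty_fn, max_steps, threshold=0.9, floor=0.1):
    """step_fn(i) produces a candidate answer at step i;
    certainty_fn(answer) returns a confidence score in [0, 1]."""
    answer = None
    for step in range(1, max_steps + 1):
        answer = step_fn(step)
        c = certainty_fn(answer)
        if c >= threshold:       # confident: stop early and save compute
            return answer, step
        if c < floor:            # unpromising: terminate the query early
            return None, step
    return answer, max_steps     # budget exhausted: return best effort

# Toy usage: candidate answers per step and their (fake) confidence scores.
candidates = ["guess-1", "guess-2", "guess-3", "guess-4"]
scores = {"guess-1": 0.5, "guess-2": 0.95}
answer, used = solve_with_budget(
    lambda i: candidates[i - 1],
    lambda a: scores.get(a, 0.5),
    max_steps=4,
)
print(answer, used)  # stops at step 2 once confidence clears the threshold
```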
Do NOT Think That Much for 2+3=? On the Overthinking of o1-Like LLMs
·2075 words·10 mins
AI Generated πŸ€— Daily Papers Natural Language Processing Large Language Models 🏒 Tencent AI Lab
Tackling large language models' overthinking problem: new efficiency metrics and a self-training strategy for efficient reasoning
Xmodel-2 Technical Report
·2136 words·11 mins
AI Generated πŸ€— Daily Papers Natural Language Processing Large Language Models 🏒 Xiaoduo AI Lab
Xmodel-2: a 1.2-billion-parameter reasoning-focused large language model that achieves state-of-the-art performance through efficient design and training strategies!
Safeguard Fine-Tuned LLMs Through Pre- and Post-Tuning Model Merging
·177 words·1 min
AI Generated πŸ€— Daily Papers Natural Language Processing Large Language Models 🏒 Intel Labs
A simple and effective model-merging method that improves the performance of fine-tuned LLMs whose safety was degraded by fine-tuning, while preserving their safety!
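The core idea of merging a pre-tuning and a post-tuning model can be sketched as weight interpolation. This is a minimal hypothetical sketch: the flat weight dictionaries, parameter names, and `alpha` value are illustrative assumptions, not details from the paper.

```python
# Hypothetical sketch: merge a safety-aligned base checkpoint with its
# fine-tuned counterpart by linear interpolation of matching weights.
def merge_weights(base, tuned, alpha=0.5):
    """Interpolate two flat {name: [floats]} weight dicts.
    alpha=0 keeps the base model; alpha=1 keeps the tuned model."""
    return {
        name: [(1 - alpha) * b + alpha * t
               for b, t in zip(base[name], tuned[name])]
        for name in base
    }

# Toy two-parameter "models": base is the safety-aligned checkpoint,
# tuned is the task-fine-tuned one.
base = {"w": [1.0, 0.0], "b": [0.5]}
tuned = {"w": [3.0, 2.0], "b": [1.5]}
merged = merge_weights(base, tuned, alpha=0.5)
print(merged["w"])  # [2.0, 1.0], the midpoint of base and tuned
```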
Token-Budget-Aware LLM Reasoning
·2417 words·12 mins
AI Generated πŸ€— Daily Papers Natural Language Processing Large Language Models 🏒 Nanjing University
The Token-Budget-Aware LLM rEasoning (TALE) framework sharply reduces the token cost of LLM reasoning while keeping performance degradation minimal!
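A budget-aware reasoning prompt of the kind this line describes can be sketched in a few lines. The template below is an assumption in the spirit of TALE; the paper's actual prompting and budget-estimation procedure may differ.

```python
# Illustrative sketch: append an explicit token budget to a reasoning
# prompt so the model is nudged toward shorter chains of thought.
def budget_prompt(question: str, budget: int) -> str:
    """Build a reasoning prompt constrained by a token budget."""
    return (
        f"{question}\n"
        f"Let's think step by step and use less than {budget} tokens."
    )

print(budget_prompt("What is 12 * 7?", 50))
```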