↓Skip to main content

🏢 University of Texas at Austin

Rethinking Addressing in Language Models via Contexualized Equivariant Positional Encoding

1 January 2025·3211 words·16 mins· loading · loading

AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 University of Texas at Austin

TAPE(conTextualized equivAriant Position Embedding) 프레임워크를 통해 문맥 정보를 활용한 동적 위치 인코딩으로 트랜스포머의 위치 기반 주소 지정 성능을 향상시켰습니다.

Understanding and Mitigating Bottlenecks of State Space Models through the Lens of Recency and Over-smoothing

31 December 2024·2638 words·13 mins· loading · loading

AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 University of Texas at Austin

심층 신경망의 장기 의존성을 모델링하는 구조적 상태 공간 모델(SSM)의 한계를 극복! 최신 연구에서 SSM의 최근 편향(recency bias) 및 과도한 평활화(over-smoothing) 문제를 규명하고, 이를 해결하는 **극성화 기법(polarization)**을 제시하여 장기 토큰 상관관계 정확도를 높였습니다.