🏢 University of Texas at Austin
Rethinking Addressing in Language Models via Contexualized Equivariant Positional Encoding
3211 words · 16 mins
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 University of Texas at Austin
The TAPE (conTextualized equivAriant Position Embedding) framework improves position-based addressing in transformers through dynamic positional encoding that draws on contextual information.
Understanding and Mitigating Bottlenecks of State Space Models through the Lens of Recency and Over-smoothing
2638 words · 13 mins
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 University of Texas at Austin
Overcoming the limits of structured state space models (SSMs), which model long-range dependencies in deep neural networks: this recent study identifies recency bias and over-smoothing in SSMs and proposes a **polarization technique** to address them, improving the accuracy of long-range token correlations.