โ†“Skip to main content

๐Ÿข Stanford University

Automated Generation of Challenging Multiple-Choice Questions for Vision Language Model Evaluation
ยท4797 wordsยท23 minsยท loading ยท loading
AI Generated ๐Ÿค— Daily Papers Natural Language Processing Question Answering ๐Ÿข Stanford University
AutoConverter๋Š” ์˜คํ”ˆ์—”๋“œ ๋ฐฉ์‹์˜ VQA ์งˆ๋ฌธ์„ ๋‹ค์ง€์„ ๋‹คํ˜• ์งˆ๋ฌธ์œผ๋กœ ์ž๋™ ๋ณ€ํ™˜ํ•˜๋Š” ์‹œ์Šคํ…œ์ž…๋‹ˆ๋‹ค. ์ด๋ฅผ ํ†ตํ•ด VLM(Vision Language Model) ํ‰๊ฐ€์˜ ๊ฐ๊ด€์„ฑ๊ณผ ์žฌํ˜„์„ฑ์„ ๋†’์ผ ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. ์—ฐ๊ตฌ์ง„์€ AutoConverter๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ 20๊ฐœ์˜ ๊ธฐ์กด VQA ๋ฐ์ดํ„ฐ์…‹์„ ํ†ตํ•ฉํ•œ VMCBench๋ผ๋Š” ์ƒˆ๋กœ์šด ๋ฒค์น˜๋งˆํฌ๋ฅผ ๊ตฌ์ถ•ํ–ˆ์Šต๋‹ˆ๋‹ค. VMCBenโ€ฆ
BoxingGym: Benchmarking Progress in Automated Experimental Design and Model Discovery
ยท3521 wordsยท17 minsยท loading ยท loading
AI Generated ๐Ÿค— Daily Papers Natural Language Processing Large Language Models ๐Ÿข Stanford University
BoxingGym: LLM ๊ธฐ๋ฐ˜ ๊ณผํ•™์  ์—์ด์ „ํŠธ์˜ ์‹คํ—˜ ์„ค๊ณ„ ๋ฐ ๋ชจ๋ธ ๋ฐœ๊ฒฌ ๋Šฅ๋ ฅ ์ข…ํ•ฉ ํ‰๊ฐ€ ๋ฒค์น˜๋งˆํฌ
Thinking in Space: How Multimodal Large Language Models See, Remember, and Recall Spaces
ยท4794 wordsยท23 minsยท loading ยท loading
AI Generated ๐Ÿค— Daily Papers Computer Vision Visual Question Answering ๐Ÿข Stanford University
MLLM์˜ ์‹œ๊ฐ-๊ณต๊ฐ„ ์ง€๋Šฅ ํ–ฅ์ƒ์— ๋„์›€์ด ๋˜๋Š” ์ƒˆ๋กœ์šด ๋น„๋””์˜ค ๊ธฐ๋ฐ˜ ๋ฒค์น˜๋งˆํฌ VSI-Bench ๋ฐœํ‘œ!
Whisper-GPT: A Hybrid Representation Audio Large Language Model
ยท1322 wordsยท7 minsยท loading ยท loading
AI Generated ๐Ÿค— Daily Papers Natural Language Processing Large Language Models ๐Ÿข Stanford University
Whisper-GPT: ํ•˜์ด๋ธŒ๋ฆฌ๋“œ ์Œ์„ฑ ๋ฐ ์Œ์•… LLM์œผ๋กœ, ์—ฐ์† ์˜ค๋””์˜ค์™€ ์ด์‚ฐ ํ† ํฐ์„ ๊ฒฐํ•ฉํ•˜์—ฌ ํ–ฅ์ƒ๋œ ์„ฑ๋Šฅ์„ ์ œ๊ณตํ•ฉ๋‹ˆ๋‹ค.