College of Computer Science and Technology, Zhejiang University

2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining
AI Generated · 🤗 Daily Papers · Multimodal Learning · Vision-Language Models
Uses 2.5 years' worth of instructional videos to build a high-quality multimodal textbook corpus and improve VLM pretraining performance.