🏢 Seoul National University
Toward Robust Hyper-Detailed Image Captioning: A Multiagent Approach and Dual Evaluation Metrics for Factuality and Coverage
2414 words · 12 mins
AI Generated
🤗 Daily Papers
Computer Vision
Visual Question Answering
🏢 Seoul National University
To address the evaluation problem in hyper-detailed image caption generation, this work proposes CapMAS, a multi-agent system based on LLM-MLLM collaboration, improving both factuality and coverage.
Background-aware Moment Detection for Video Moment Retrieval
2175 words · 11 mins
AI Generated
Computer Vision
Video Understanding
🏢 Seoul National University
BM-DETR: Solving the weak alignment problem in video moment retrieval by leveraging background information!