Seoul National University

Toward Robust Hyper-Detailed Image Captioning: A Multiagent Approach and Dual Evaluation Metrics for Factuality and Coverage
·2414 words·12 mins
AI Generated 🤗 Daily Papers Computer Vision Visual Question Answering Seoul National University
To address hallucination in hyper-detailed image captioning, the paper proposes CapMAS, a multi-agent system built on LLM-MLLM collaboration, which improves both factuality and coverage.
Background-aware Moment Detection for Video Moment Retrieval
·2175 words·11 mins
AI Generated Computer Vision Video Understanding Seoul National University
BM-DETR: Solving the weak alignment problem in video moment retrieval by leveraging background information!