🏢 Seoul National University

Toward Robust Hyper-Detailed Image Captioning: A Multiagent Approach and Dual Evaluation Metrics for Factuality and Coverage

20 December 2024·2414 words·12 mins

AI Generated 🤗 Daily Papers Computer Vision Visual Question Answering 🏢 Seoul National University

To address hallucination in hyper-detailed image captioning, this work proposes CapMAS, a multiagent system built on LLM-MLLM collaboration that improves both factuality and coverage.

Background-aware Moment Detection for Video Moment Retrieval

5 June 2023·2175 words·11 mins

AI Generated Computer Vision Video Understanding 🏢 Seoul National University

BM-DETR: Solving the weak alignment problem in video moment retrieval by leveraging background information!