🏢 Zhejiang University

VideoRefer Suite: Advancing Spatial-Temporal Object Understanding with Video LLM

31 December 2024·3245 words·16 mins

AI Generated 🤗 Daily Papers Multimodal Learning Vision-Language Models 🏢 Zhejiang University

VideoRefer Suite presents a new video LLM (VideoRefer) for fine-grained spatial-temporal object understanding, a large-scale, high-quality dataset (VideoRefer-700K), and a comprehensive benchmark (VideoRefer-Bench).

OneKE: A Dockerized Schema-Guided LLM Agent-based Knowledge Extraction System

28 December 2024·304 words·2 mins

AI Generated 🤗 Daily Papers Natural Language Processing Information Extraction 🏢 Zhejiang University

OneKE: a Docker-based, multi-agent LLM knowledge extraction system that can extract knowledge across diverse domains from web pages and PDFs.

Orient Anything: Learning Robust Object Orientation Estimation from Rendering 3D Models

24 December 2024·2368 words·12 mins

AI Generated 🤗 Daily Papers Computer Vision 3D Vision 🏢 Zhejiang University

Presents the ‘Orient Anything’ model, which substantially improves the accuracy of object orientation estimation from a single image!

Prompting Depth Anything for 4K Resolution Accurate Metric Depth Estimation

18 December 2024·3901 words·19 mins

AI Generated 🤗 Daily Papers Computer Vision 3D Vision 🏢 Zhejiang University

Presents Prompt Depth Anything, a new paradigm for accurate metric depth estimation at 4K resolution using a low-cost LiDAR prompt!