🏒 Zhejiang University

VideoRefer Suite: Advancing Spatial-Temporal Object Understanding with Video LLM
·3245 words·16 mins
AI Generated 🤗 Daily Papers Multimodal Learning Vision-Language Models 🏢 Zhejiang University
VideoRefer Suite presents a new video LLM (VideoRefer) for fine-grained spatial-temporal object understanding, together with a large-scale, high-quality dataset (VideoRefer-700K) and a comprehensive benchmark (VideoRefer-Bench).
OneKE: A Dockerized Schema-Guided LLM Agent-based Knowledge Extraction System
·304 words·2 mins
AI Generated 🤗 Daily Papers Natural Language Processing Information Extraction 🏢 Zhejiang University
OneKE: a Docker-based, multi-agent LLM knowledge extraction system that can extract knowledge across diverse domains from the web and PDFs.
Orient Anything: Learning Robust Object Orientation Estimation from Rendering 3D Models
·2368 words·12 mins
AI Generated 🤗 Daily Papers Computer Vision 3D Vision 🏢 Zhejiang University
Presents "Orient Anything", a model that greatly improves the accuracy of object orientation estimation from a single image!
Prompting Depth Anything for 4K Resolution Accurate Metric Depth Estimation
·3901 words·19 mins
AI Generated 🤗 Daily Papers Computer Vision 3D Vision 🏢 Zhejiang University
Presents Prompt Depth Anything, a new paradigm for accurate 4K high-resolution metric depth estimation using low-cost LiDAR prompts!