Computer Vision

AniDoc: Animation Creation Made Easier
·1844 words·9 mins
AI Generated 🤗 Daily Papers Computer Vision Video Understanding 🏢 Hong Kong University of Science and Technology
AniDoc: an innovative AI model that uses sparse sketches and a reference image to automatically colorize and interpolate 2D animation!
VidTok: A Versatile and Open-Source Video Tokenizer
·2469 words·12 mins
AI Generated 🤗 Daily Papers Computer Vision Video Understanding 🏢 Microsoft Research
VidTok: an open-source, high-performance video tokenizer that achieves state-of-the-art performance in both continuous and discrete tokenization, opening new possibilities for video generation and understanding research through an efficient training strategy and an innovative quantization technique.
Move-in-2D: 2D-Conditioned Human Motion Generation
·1943 words·10 mins
AI Generated 🤗 Daily Papers Computer Vision Video Understanding 🏢 Adobe Research
Move-in-2D: realistic human motion generation from a 2D image and a text prompt.
ChatDiT: A Training-Free Baseline for Task-Agnostic Free-Form Chatting with Diffusion Transformers
·1484 words·7 mins
AI Generated 🤗 Daily Papers Computer Vision Image Generation 🏢 Tongyi Lab
ChatDiT: uses pretrained diffusion transformers in a zero-shot manner to solve diverse visual tasks through natural language!
Wonderland: Navigating 3D Scenes from a Single Image
·2841 words·14 mins
AI Generated 🤗 Daily Papers Computer Vision 3D Vision 🏢 University of Toronto
An efficient, scalable framework that generates high-quality 3D scenes from a single image.
StrandHead: Text to Strand-Disentangled 3D Head Avatars Using Hair Geometric Priors
·1741 words·9 mins
AI Generated 🤗 Daily Papers Computer Vision 3D Vision 🏢 Nanjing University
StrandHead: generates realistic 3D head avatars, down to detailed hairstyles, from text alone.
Sequence Matters: Harnessing Video Models in 3D Super-Resolution
·3903 words·19 mins
AI Generated 🤗 Daily Papers Computer Vision 3D Vision 🏢 Department of Electrical and Computer Engineering, Sungkyunkwan University
An innovative 3D super-resolution technique built on video super-resolution models that achieves state-of-the-art performance without an alignment step!
Nearly Zero-Cost Protection Against Mimicry by Personalized Diffusion Models
·3489 words·17 mins
AI Generated 🤗 Daily Papers Computer Vision Image Generation 🏢 Inha University
Real-time image protection: a countermeasure against deepfakes.
MOVIS: Enhancing Multi-Object Novel View Synthesis for Indoor Scenes
·2949 words·14 mins
AI Generated 🤗 Daily Papers Computer Vision 3D Vision 🏢 Peking University
MOVIS improves structural awareness in multi-object novel view synthesis for indoor scenes, producing consistent and realistic novel views.
IDArb: Intrinsic Decomposition for Arbitrary Number of Input Views and Illuminations
·3273 words·16 mins
AI Generated 🤗 Daily Papers Computer Vision 3D Vision 🏢 Chinese University of Hong Kong
IDArb: Decomposition under varied lights.
ColorFlow: Retrieval-Augmented Image Sequence Colorization
·2273 words·11 mins
AI Generated 🤗 Daily Papers Computer Vision Image Generation 🏢 Tsinghua University
Automated comic colorization: ColorFlow colorizes black-and-white comic sequences while preserving ID consistency.
Causal Diffusion Transformers for Generative Modeling
·4953 words·24 mins
AI Generated 🤗 Daily Papers Computer Vision Image Generation 🏢 ByteDance Research
CausalFusion combines diffusion and autoregressive models to achieve state-of-the-art results in generative modeling and to enable new capabilities.
VividFace: A Diffusion-Based Hybrid Framework for High-Fidelity Video Face Swapping
·1707 words·9 mins
AI Generated 🤗 Daily Papers Computer Vision Image Generation 🏢 CUHK MMLab
VividFace: the first diffusion-based video face-swapping framework, delivering high-fidelity results.
GaussianProperty: Integrating Physical Properties to 3D Gaussians with LMMs
·2657 words·13 mins
AI Generated 🤗 Daily Papers Computer Vision 3D Vision 🏢 Hong Kong University of Science and Technology
GaussianProperty is a training-free framework that uses LMMs to integrate physical properties into 3D Gaussians, enabling downstream tasks such as physics-based simulation and robotic grasping.
DynamicScaler: Seamless and Scalable Video Generation for Panoramic Scenes
·1754 words·9 mins
AI Generated 🤗 Daily Papers Computer Vision Image Generation 🏢 Google DeepMind
DynamicScaler generates long, seamless panoramic videos from text or images, maintaining consistent motion regardless of resolution or aspect ratio.
SplineGS: Robust Motion-Adaptive Spline for Real-Time Dynamic 3D Gaussians from Monocular Video
·3662 words·18 mins
AI Generated 🤗 Daily Papers Computer Vision 3D Vision 🏢 KAIST
SplineGS: a robust motion-adaptive spline for real-time dynamic 3D scenes.
Prompt2Perturb (P2P): Text-Guided Diffusion-Based Adversarial Attacks on Breast Ultrasound Images
·1580 words·8 mins
AI Generated 🤗 Daily Papers Computer Vision Image Generation 🏢 University of British Columbia
P2P: a new text-guided adversarial attack that targets the vulnerabilities of medical-imaging DNNs.
LinGen: Towards High-Resolution Minute-Length Text-to-Video Generation with Linear Computational Complexity
·3571 words·17 mins
AI Generated 🤗 Daily Papers Computer Vision Video Understanding 🏢 Princeton University
LinGen: minute-length, high-resolution text-to-video generation that maximizes efficiency through linear computational complexity.
BrushEdit: All-In-One Image Inpainting and Editing
·3188 words·15 mins
AI Generated 🤗 Daily Papers Computer Vision Image Generation 🏢 Peking University
BrushEdit: All-in-One Image Inpainting & Editing.
InstanceCap: Improving Text-to-Video Generation via Instance-aware Structured Caption
·3493 words·17 mins
AI Generated 🤗 Daily Papers Computer Vision Video Understanding 🏢 Nanjing University
InstanceCap: improves text-to-video generation through instance-aware structured captions.