Computer Vision
AniDoc: Animation Creation Made Easier
·1844 words·9 mins·
AI Generated
🤗 Daily Papers
Computer Vision
Video Understanding
🏢 Hong Kong University of Science and Technology
AniDoc: an AI model that uses sparse sketches and a reference image to automate colorization and in-betweening for 2D animation.
VidTok: A Versatile and Open-Source Video Tokenizer
·2469 words·12 mins·
AI Generated
🤗 Daily Papers
Computer Vision
Video Understanding
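🏢 Microsoft Research
VidTok: an open-source, high-performance video tokenizer that achieves state-of-the-art results in both continuous and discrete tokenization, opening new possibilities for video generation and understanding research through an efficient training strategy and a novel quantization technique.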
Move-in-2D: 2D-Conditioned Human Motion Generation
·1943 words·10 mins·
AI Generated
🤗 Daily Papers
Computer Vision
Video Understanding
🏢 Adobe Research
Move-in-2D: generates realistic human motion from a 2D image and a text prompt.
ChatDiT: A Training-Free Baseline for Task-Agnostic Free-Form Chatting with Diffusion Transformers
·1484 words·7 mins·
AI Generated
🤗 Daily Papers
Computer Vision
Image Generation
🏢 Tongyi Lab
ChatDiT: uses pretrained diffusion transformers in a zero-shot manner to solve diverse visual tasks through natural language.
Wonderland: Navigating 3D Scenes from a Single Image
·2841 words·14 mins·
AI Generated
🤗 Daily Papers
Computer Vision
3D Vision
🏢 University of Toronto
An efficient, scalable framework for generating high-quality 3D scenes from a single image.
StrandHead: Text to Strand-Disentangled 3D Head Avatars Using Hair Geometric Priors
·1741 words·9 mins·
AI Generated
🤗 Daily Papers
Computer Vision
3D Vision
🏢 Nanjing University
StrandHead: generates realistic 3D head avatars, down to fine-grained hairstyles, from text alone.
Sequence Matters: Harnessing Video Models in 3D Super-Resolution
·3903 words·19 mins·
AI Generated
🤗 Daily Papers
Computer Vision
3D Vision
🏢 Department of Electrical and Computer Engineering, Sungkyunkwan University
A 3D super-resolution technique built on video super-resolution models that achieves state-of-the-art performance without an alignment step.
Nearly Zero-Cost Protection Against Mimicry by Personalized Diffusion Models
·3489 words·17 mins·
AI Generated
🤗 Daily Papers
Computer Vision
Image Generation
🏢 Inha University
Real-time image protection as a countermeasure against deepfakes.
MOVIS: Enhancing Multi-Object Novel View Synthesis for Indoor Scenes
·2949 words·14 mins·
AI Generated
🤗 Daily Papers
Computer Vision
3D Vision
🏢 Peking University
MOVIS improves structural awareness in multi-object novel view synthesis for indoor scenes, producing consistent and realistic novel views.
IDArb: Intrinsic Decomposition for Arbitrary Number of Input Views and Illuminations
·3273 words·16 mins·
AI Generated
🤗 Daily Papers
Computer Vision
3D Vision
🏢 Chinese University of Hong Kong
IDArb: Decomposition under varied lights.
ColorFlow: Retrieval-Augmented Image Sequence Colorization
·2273 words·11 mins·
AI Generated
🤗 Daily Papers
Computer Vision
Image Generation
🏢 Tsinghua University
Automating comic colorization: ColorFlow colorizes black-and-white comic sequences while preserving ID consistency.
Causal Diffusion Transformers for Generative Modeling
·4953 words·24 mins·
AI Generated
🤗 Daily Papers
Computer Vision
Image Generation
🏢 ByteDance Research
CausalFusion combines diffusion and autoregressive models to achieve state-of-the-art results in generative modeling and enable new capabilities.
VividFace: A Diffusion-Based Hybrid Framework for High-Fidelity Video Face Swapping
·1707 words·9 mins·
AI Generated
🤗 Daily Papers
Computer Vision
Image Generation
🏢 CUHK MMLab
VividFace: the first diffusion-based video face-swapping framework, delivering high-fidelity results.
GaussianProperty: Integrating Physical Properties to 3D Gaussians with LMMs
·2657 words·13 mins·
AI Generated
🤗 Daily Papers
Computer Vision
3D Vision
🏢 Hong Kong University of Science and Technology
GaussianProperty is a training-free framework that uses LMMs to integrate physical properties into 3D Gaussians, enabling downstream tasks such as physics-based simulation and robotic grasping.
DynamicScaler: Seamless and Scalable Video Generation for Panoramic Scenes
·1754 words·9 mins·
AI Generated
🤗 Daily Papers
Computer Vision
Image Generation
🏢 Google DeepMind
DynamicScaler generates long, seamless panoramic videos from text or images, maintaining consistent motion regardless of resolution and aspect ratio.
SplineGS: Robust Motion-Adaptive Spline for Real-Time Dynamic 3D Gaussians from Monocular Video
·3662 words·18 mins·
AI Generated
🤗 Daily Papers
Computer Vision
3D Vision
🏢 KAIST
SplineGS: a robust motion-adaptive spline for real-time dynamic 3D scenes.
Prompt2Perturb (P2P): Text-Guided Diffusion-Based Adversarial Attacks on Breast Ultrasound Images
·1580 words·8 mins·
AI Generated
🤗 Daily Papers
Computer Vision
Image Generation
🏢 University of British Columbia
P2P: a new text-guided adversarial attack that targets the vulnerabilities of medical-imaging DNNs.
LinGen: Towards High-Resolution Minute-Length Text-to-Video Generation with Linear Computational Complexity
·3571 words·17 mins·
AI Generated
🤗 Daily Papers
Computer Vision
Video Understanding
🏢 Princeton University
LinGen: minute-length, high-resolution text-to-video generation that maximizes efficiency through linear computational complexity.
BrushEdit: All-In-One Image Inpainting and Editing
·3188 words·15 mins·
AI Generated
🤗 Daily Papers
Computer Vision
Image Generation
🏢 Peking University
BrushEdit: All-in-One Image Inpainting & Editing.
InstanceCap: Improving Text-to-Video Generation via Instance-aware Structured Caption
·3493 words·17 mins·
AI Generated
🤗 Daily Papers
Computer Vision
Video Understanding
🏢 Nanjing University
InstanceCap: improves text-to-video generation through instance-aware structured captions.