Image Generation

Through-The-Mask: Mask-based Motion Trajectories for Image-to-Video Generation
2799 words · 14 mins
AI Generated 🤗 Daily Papers Computer Vision Image Generation 🏢 Meta
THROUGH-THE-MASK, a two-stage image-to-video generation framework built on mask-based motion trajectories, enables accurate animation of multiple objects.
Reconstruction vs. Generation: Taming Optimization Dilemma in Latent Diffusion Models
2873 words · 14 mins
AI Generated 🤗 Daily Papers Computer Vision Image Generation 🏢 Huazhong University of Science and Technology
VA-VAE resolves the optimization dilemma in high-dimensional latent spaces and achieves state-of-the-art performance in high-resolution image generation.
Nested Attention: Semantic-aware Attention Values for Concept Personalization
1325 words · 7 mins
AI Generated 🤗 Daily Papers Computer Vision Image Generation 🏢 Tel Aviv University
Presents Nested Attention, a technique that uses a nested attention mechanism to improve the personalization performance of text-to-image models.
MLLM-as-a-Judge for Image Safety without Human Labeling
5796 words · 28 mins
AI Generated 🤗 Daily Papers Computer Vision Image Generation 🏢 Meta AI
Presents a new zero-shot method that judges image safety with a pre-trained multimodal large language model (MLLM), using predefined safety rules and no human labeling.
VMix: Improving Text-to-Image Diffusion Model with Cross-Attention Mixing Control
2196 words · 11 mins
AI Generated 🤗 Daily Papers Computer Vision Image Generation 🏢 ByteDance Inc
VMix: improving text-to-image diffusion models through cross-attention mixing control.
Edicho: Consistent Image Editing in the Wild
2213 words · 11 mins
AI Generated 🤗 Daily Papers Computer Vision Image Generation 🏢 Hong Kong University of Science and Technology
Edicho: zero-shot image editing that keeps edits consistent across images.
Bringing Objects to Life: 4D generation from 3D objects
2224 words · 11 mins
AI Generated 🤗 Daily Papers Computer Vision Image Generation 🏢 NVIDIA
3to4D: realistically animates user-provided 3D objects from text prompts.
VideoMaker: Zero-shot Customized Video Generation with the Inherent Force of Video Diffusion Models
3812 words · 18 mins
AI Generated 🤗 Daily Papers Computer Vision Image Generation 🏢 Tencent AI Lab
VideoMaker: zero-shot customized video generation that draws on the inherent force of video diffusion models.
1.58-bit FLUX
1092 words · 6 mins
AI Generated 🤗 Daily Papers Computer Vision Image Generation 🏢 ByteDance
1.58-bit FLUX: quantizes 99.5% of the parameters to 1.58 bits, cutting model size by 7.7x and inference memory by 5.1x while maintaining high-quality image generation.
Distilled Decoding 1: One-step Sampling of Image Auto-regressive Models with Flow Matching
3113 words · 15 mins
AI Generated 🤗 Daily Papers Computer Vision Image Generation 🏢 Tsinghua University
Proposes Distilled Decoding (DD), a technique that greatly speeds up image autoregressive models through one-step sampling.
CLEAR: Conv-Like Linearization Revs Pre-Trained Diffusion Transformers Up
3581 words · 17 mins
AI Generated 🤗 Daily Papers Computer Vision Image Generation 🏢 National University of Singapore
CLEAR: greatly speeds up high-resolution image generation with linearized attention.
UIP2P: Unsupervised Instruction-based Image Editing via Cycle Edit Consistency
2616 words · 13 mins
AI Generated 🤗 Daily Papers Computer Vision Image Generation 🏢 ETH Zurich
Uses unsupervised Cycle Edit Consistency (CEC) to open new ground in instruction-based image editing.
Parallelized Autoregressive Visual Generation
3557 words · 17 mins
AI Generated 🤗 Daily Papers Computer Vision Image Generation 🏢 Peking University
This work speeds up autoregressive visual generation by up to 9.5x through a parallelization strategy that accounts for token dependencies.
LeviTor: 3D Trajectory Oriented Image-to-Video Synthesis
2184 words · 11 mins
AI Generated 🤗 Daily Papers Computer Vision Image Generation 🏢 Hong Kong University of Science and Technology
LeviTor: an innovative model that synthesizes realistic video from nothing more than simple user-provided 3D trajectory input.
Affordance-Aware Object Insertion via Mask-Aware Dual Diffusion
3112 words · 15 mins
AI Generated 🤗 Daily Papers Computer Vision Image Generation 🏢 Harvard University
Affordance-Aware Object Insertion: realistic image compositing that accounts for the interaction between background and foreground.
PixelMan: Consistent Object Editing with Diffusion Models via Pixel Manipulation and Generation
3040 words · 15 mins
AI Generated 🤗 Daily Papers Computer Vision Image Generation 🏢 Dept. ECE, University of Alberta
PixelMan is an innovative diffusion-model-based method that achieves consistent object editing in just 16 steps, without training, through pixel manipulation and generation.
FashionComposer: Compositional Fashion Image Generation
2170 words · 11 mins
AI Generated 🤗 Daily Papers Computer Vision Image Generation 🏢 University of Hong Kong
FashionComposer: an innovative framework that composes realistic fashion images from diverse inputs (text, garment images, 3D models).
Autoregressive Video Generation without Vector Quantization
3553 words · 17 mins
AI Generated 🤗 Daily Papers Computer Vision Image Generation 🏢 BAAI
Develops NOVA, an efficient and flexible autoregressive video generation model that works without vector quantization.
ChatDiT: A Training-Free Baseline for Task-Agnostic Free-Form Chatting with Diffusion Transformers
1484 words · 7 mins
AI Generated 🤗 Daily Papers Computer Vision Image Generation 🏢 Tongyi Lab
ChatDiT: uses pre-trained diffusion transformers in a zero-shot manner to solve diverse visual tasks through natural language.
Nearly Zero-Cost Protection Against Mimicry by Personalized Diffusion Models
3489 words · 17 mins
AI Generated 🤗 Daily Papers Computer Vision Image Generation 🏢 Inha University
Real-time image protection as a countermeasure against deepfakes.