StableDrag: Stable Dragging for Point-based Image Editing

Anonymous Institutions
[Gallery: qualitative dragging results produced by StableDrag-GAN and StableDrag-Diff]

Abstract

Point-based image editing has attracted remarkable attention since the emergence of DragGAN. Recently, DragDiffusion further improved generative quality by adapting this dragging technique to diffusion models. Despite their great success, this dragging scheme exhibits two major drawbacks, namely inaccurate point tracking and incomplete motion supervision, which can lead to unsatisfactory dragging outcomes. To tackle these issues, we build a stable and precise drag-based editing framework, coined StableDrag, by designing a discriminative point tracking method and a confidence-based latent enhancement strategy for motion supervision. The former allows us to precisely locate the updated handle points, thereby improving the stability of long-range manipulation, while the latter ensures that the optimized latent remains of high quality across all manipulation steps. Thanks to these designs, we instantiate two types of image editing models, StableDrag-GAN and StableDrag-Diff, which attain more stable dragging performance, as demonstrated by extensive qualitative experiments and quantitative assessment on DragBench.
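To make the two components concrete, below is a minimal, illustrative sketch of one drag-optimization step in PyTorch. It is not the authors' implementation: the feature map, patch radius r, helper names (bilinear_sample, motion_supervision_loss, track_point), and the confidence score returned by tracking are assumptions introduced only to show where DragGAN-style motion supervision and a confidence-aware point tracking step would sit in the loop.

```python
# Minimal sketch (assumed names and parameters), not the official StableDrag code.
import torch
import torch.nn.functional as F_nn


def bilinear_sample(feat, x, y):
    """Bilinearly sample a (C, H, W) feature map at float pixel coordinates."""
    C, H, W = feat.shape
    gx = 2.0 * x / (W - 1) - 1.0                         # normalize to [-1, 1]
    gy = 2.0 * y / (H - 1) - 1.0
    grid = torch.stack([gx, gy], dim=-1).unsqueeze(0)    # (1, h, w, 2)
    out = F_nn.grid_sample(feat.unsqueeze(0), grid, align_corners=True)
    return out.squeeze(0)                                 # (C, h, w)


def motion_supervision_loss(feat, handle, target, r=3):
    """DragGAN-style motion supervision: pull the feature patch around the
    handle point one small step toward the target point."""
    d = target - handle
    d = d / (d.norm() + 1e-8)                              # unit step direction
    ys = torch.arange(int(handle[1]) - r, int(handle[1]) + r + 1)
    xs = torch.arange(int(handle[0]) - r, int(handle[0]) + r + 1)
    gy, gx = torch.meshgrid(ys, xs, indexing="ij")
    src = bilinear_sample(feat, gx.float(), gy.float())
    dst = bilinear_sample(feat, gx.float() + d[0], gy.float() + d[1])
    # Detach the source patch so gradients move content toward the target.
    return F_nn.l1_loss(dst, src.detach())


def track_point(feat, template, handle, r=6):
    """Locate the updated handle point by scoring feature similarity against a
    template of the original handle feature; also return a confidence score
    that a confidence-based scheme could threshold before trusting the update."""
    ys = torch.arange(int(handle[1]) - r, int(handle[1]) + r + 1)
    xs = torch.arange(int(handle[0]) - r, int(handle[0]) + r + 1)
    gy, gx = torch.meshgrid(ys, xs, indexing="ij")
    patch = bilinear_sample(feat, gx.float(), gy.float())          # (C, h, w)
    score = F_nn.cosine_similarity(patch, template[:, None, None], dim=0)
    conf, idx = score.flatten().max(dim=0)
    new_handle = torch.tensor(
        [gx.flatten()[idx], gy.flatten()[idx]], dtype=torch.float32
    )
    return new_handle, conf.item()
```

In an actual editing loop, the loss would be back-propagated into the GAN or diffusion latent at each step, and the tracked point (together with its confidence) would decide how the handle position and supervision signal are updated before the next step.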


Figure 1. Illustration of our dragging scheme for a single intermediate optimization step. The pipeline illustrated here is built on a GAN; the diffusion-based variant follows the same scheme.

More results of our StableDrag

Comparison between StableDrag-GAN and FreeDrag