TokenFlow is an open-source framework for video editing that operates directly in the latent space of diffusion models. Rather than editing each frame in isolation, TokenFlow tracks and reuses internal tokens from a reference video, which yields temporally consistent edits with less flickering and fewer frame-to-frame artifacts. It is designed for video-to-video translation, style transfer, object or attribute editing, and subtle appearance changes that preserve the motion and structure of the input video.

Using TokenFlow, developers and researchers can perform frame-consistent edits by conditioning a diffusion model on both the original video and a text prompt or modified reference frame. The method propagates visual changes across frames by aligning and reusing latent tokens, which helps maintain coherence in textures, lighting, and object boundaries over time. TokenFlow integrates with Stable Diffusion–based pipelines and can be incorporated into custom workflows for content creation, visual effects, and research on generative video models. It is particularly useful in scenarios where high temporal consistency is required, such as editing faces in videos, applying uniform artistic styles, or modifying specific regions without disrupting overall motion.
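
The idea of propagating an edit by aligning and reusing tokens can be illustrated with a toy sketch. This is not TokenFlow's actual code: the function names (`cosine`, `nearest_index`, `propagate_tokens`) and the two-dimensional token vectors are hypothetical, and the real method works on high-dimensional features inside the diffusion model and differs in detail. The sketch assumes each token is a plain list of floats and that a single keyframe has already been edited.

```python
import math

def cosine(a, b):
    """Cosine similarity between two token vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def nearest_index(token, candidates):
    """Index of the candidate token most similar to `token`."""
    return max(range(len(candidates)), key=lambda i: cosine(token, candidates[i]))

def propagate_tokens(key_src, key_edit, frame_src):
    """Hypothetical helper: carry an edit from a keyframe to another frame.

    key_src   -- tokens of the ORIGINAL keyframe
    key_edit  -- tokens of the EDITED keyframe (same order as key_src)
    frame_src -- tokens of another ORIGINAL frame

    Correspondences are computed on the original video only, so the
    motion of the source footage decides where each edited token goes.
    """
    if len(key_src) != len(key_edit):
        raise ValueError("key_src and key_edit must have the same length")
    return [list(key_edit[nearest_index(tok, key_src)]) for tok in frame_src]

# Toy data: two tokens per frame, two features per token.
key_src = [[1.0, 0.0], [0.0, 1.0]]
key_edit = [[5.0, 5.0], [9.0, 9.0]]
frame_src = [[0.1, 0.9], [0.8, 0.2]]  # same content as the keyframe, swapped positions

propagated = propagate_tokens(key_src, key_edit, frame_src)
print(propagated)  # [[9.0, 9.0], [5.0, 5.0]]
```

Because every non-keyframe token is copied from an edited keyframe token instead of being generated independently, regions that correspond in the source video receive the same edited appearance, which is the intuition behind the reduced flicker described above.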

Alternatives & Similar Tools

Presentory

Dzine

Tila

Veed

RemixAI

Gliastar

Descript

VideoLDM by Nvidia