
TokenFlow is a video editing framework that enables localized, temporally coherent modifications by propagating diffusion model features (tokens) across frames.
TokenFlow is an open-source framework for video editing and manipulation that operates directly in the latent space of diffusion models. Instead of regenerating entire frames, TokenFlow tracks and reuses internal tokens from a reference video, enabling temporally consistent edits with significantly reduced flickering and artifacts. The tool is designed for tasks such as video-to-video translation, style transfer, object or attribute editing, and subtle appearance changes while preserving the original motion and structure of the input video.
Using TokenFlow, developers and researchers can perform frame-consistent edits by conditioning a diffusion model on both the original video and a text prompt or modified reference frame. The method propagates visual changes across frames by aligning and reusing latent tokens, which helps maintain coherence in textures, lighting, and object boundaries over time. TokenFlow integrates with Stable Diffusion–based pipelines and can be incorporated into custom workflows for content creation, visual effects, and research on generative video models. It is particularly useful in scenarios where high temporal consistency is required, such as editing faces in videos, applying uniform artistic styles, or modifying specific regions without disrupting overall motion.
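The idea of aligning and reusing tokens can be illustrated with a small sketch. The code below is not TokenFlow's implementation or API. It is a simplified NumPy illustration that assumes each frame is represented as a matrix of token vectors, and the function name `propagate_tokens` and its arguments are hypothetical. For every token in an unedited frame, it finds the most similar token in the original keyframe (by cosine similarity) and copies over the corresponding token from the edited keyframe.

```python
import numpy as np


def propagate_tokens(key_src, key_edit, frame_src):
    """Copy edited keyframe tokens to another frame via nearest-neighbor matching.

    key_src:   (N, D) tokens of the original keyframe (hypothetical layout)
    key_edit:  (N, E) tokens of the edited keyframe, row-aligned with key_src
    frame_src: (M, D) tokens of the original frame to be edited
    Returns the (M, E) propagated tokens and the (M,) matched keyframe indices.
    """
    def unit(x):
        # Normalize rows so that the dot product below is cosine similarity.
        return x / (np.linalg.norm(x, axis=1, keepdims=True) + 1e-8)

    similarity = unit(frame_src) @ unit(key_src).T  # shape (M, N)
    nn_idx = similarity.argmax(axis=1)              # best keyframe match per token
    return key_edit[nn_idx], nn_idx


# Toy example: the frame contains the keyframe's tokens in a shuffled order.
key_src = np.eye(4)
key_edit = np.arange(8.0).reshape(4, 2)
frame_src = key_src[[2, 0, 3, 1]]

edited, idx = propagate_tokens(key_src, key_edit, frame_src)
print(idx)  # [2 0 3 1]
```

Because the edited tokens are looked up through correspondences computed on the original video, the same edited content follows each region as it moves, which is the intuition behind the temporal consistency described above. A real pipeline would operate on diffusion model features at multiple layers and timesteps rather than on toy vectors.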
Explore 486+ top alternatives to TokenFlow

Videoscribe is an AI-powered video creation tool that enables users to generate animated explainer videos and whiteboard-style presentations from text, images, and voiceovers.
Presentory is an AI-powered presentation tool that generates slide layouts, designs, and visual content to help users create structured, engaging presentations efficiently.