
TokenFlow
TokenFlow is a text-driven video editing framework that produces temporally consistent edits by propagating diffusion model features across frames.
TokenFlow is an open-source framework for video editing that operates directly in the latent space of diffusion models. Instead of editing each frame independently, it tracks and reuses internal tokens from a reference video, which yields temporally consistent edits with less flickering and fewer artifacts. It is designed for tasks such as video-to-video translation, style transfer, object or attribute editing, and subtle appearance changes that preserve the original motion and structure of the input video.
Using TokenFlow, developers and researchers can perform frame-consistent edits by conditioning a diffusion model on both the original video and a text prompt or modified reference frame. The method propagates visual changes across frames by aligning and reusing latent tokens, which helps maintain coherence in textures, lighting, and object boundaries over time. TokenFlow integrates with Stable Diffusion–based pipelines and can be incorporated into custom workflows for content creation, visual effects, and research on generative video models. It is particularly useful in scenarios where high temporal consistency is required, such as editing faces in videos, applying uniform artistic styles, or modifying specific regions without disrupting overall motion.
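The token reuse described above can be illustrated with a small toy sketch. This is not TokenFlow's actual code or API. It assumes that each frame is represented as a list of token vectors, that correspondences are found by cosine-similarity nearest neighbor against a single keyframe, and that the edited keyframe token is copied over unchanged. The function names (`cosine_similarity`, `nearest_token_index`, `propagate_tokens`) are hypothetical and chosen only for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest_token_index(query, candidates):
    """Index of the candidate token most similar to the query token."""
    return max(range(len(candidates)),
               key=lambda i: cosine_similarity(query, candidates[i]))


def propagate_tokens(key_original, key_edited, frame_original):
    """Hypothetical helper: build edited tokens for a frame from an edited keyframe.

    Correspondences are computed between ORIGINAL tokens (frame vs. keyframe),
    then the EDITED keyframe token at the matched index is reused.
    """
    if len(key_original) != len(key_edited):
        raise ValueError("keyframe token lists must have the same length")
    return [key_edited[nearest_token_index(tok, key_original)]
            for tok in frame_original]


# Toy data: a keyframe with two tokens and a later frame with three tokens.
key_original = [[1.0, 0.0], [0.0, 1.0]]
key_edited = [[5.0, 5.0], [9.0, 9.0]]          # pretend these came from an edit
frame_original = [[0.1, 0.9], [0.8, 0.2], [0.0, 1.0]]

propagated = propagate_tokens(key_original, key_edited, frame_original)
print(propagated)  # [[9.0, 9.0], [5.0, 5.0], [9.0, 9.0]]
```

Because the matching uses only the original video's tokens, every frame inherits its edit from the same edited keyframe, which is the intuition behind the reduced flicker. A real pipeline would work on diffusion model feature tensors and would likely blend contributions from more than one keyframe.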
Alternatives & Similar Tools

AI Video Cut
AI Video Cut is an AI-powered tool that automatically edits, trims, and repurposes long-form videos into shorter clips optimized for social media and content platforms.

Pictory AI
Pictory AI is an AI-powered video creation and editing tool designed to turn text, scripts, or long-

LiveMemory
LiveMemory by MyHeritage is an AI-powered photo animation tool that transforms still images into sho

LipSync.video
LipSync.video is a web-based tool that generates lip-synced talking videos from text or audio by animating static images or existing footage.