Google's AutoFlip: Google's New AI Can Intelligently Crop Any Video for Any Screen Aspect Ratio
Google is working on AutoFlip, an open-source framework for intelligent video reframing.
Videos filmed and edited for television and desktop are typically created and viewed in landscape aspect ratios such as 16:9 or 4:3. Problematically, these aspect ratios don't always fit the display being used for viewing, leaving viewers searching for ways to crop their videos. To address this problem, Google is working on AutoFlip, an open-source framework for intelligent video reframing.
In a blog post today, Google announced that it is developing a new open-source framework for intelligent video reframing called AutoFlip.
AutoFlip provides a solution for intelligent, automated, and adaptive video reframing.
Google's AutoFlip AI:
AutoFlip analyzes the video content, develops optimal tracking and cropping strategies, and produces an output video with the same duration in the desired aspect ratio.
How Does It Work?
AutoFlip performs video dimension conversion in three stages:
- Shot (scene) detection.
- Video content analysis.
- Reframing.
Three stages of Google AutoFlip AI
- Shot (Scene) Detection- The first stage is scene detection, in which the machine learning model detects the point just before a cut or a jump from one scene to another. It compares each frame with the previous one to detect changes in colors and elements.
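The frame-by-frame comparison described above can be sketched with a simple color-histogram difference. This is not AutoFlip's actual implementation; the histogram distance metric and the threshold are assumptions chosen for illustration.

```python
# Sketch of shot-boundary detection by comparing each frame's color
# histogram with the previous frame's (an assumption for illustration,
# not AutoFlip's real detector).
import numpy as np

def color_histogram(frame, bins=16):
    """Per-channel color histogram, normalized to sum to 1."""
    hists = [np.histogram(frame[..., c], bins=bins, range=(0, 255))[0]
             for c in range(frame.shape[-1])]
    h = np.concatenate(hists).astype(float)
    return h / h.sum()

def detect_shot_boundaries(frames, threshold=0.5):
    """Return frame indices where the histogram distance to the
    previous frame exceeds the threshold (a likely cut)."""
    boundaries = []
    prev = color_histogram(frames[0])
    for i in range(1, len(frames)):
        cur = color_histogram(frames[i])
        # L1 distance between consecutive normalized histograms
        if np.abs(cur - prev).sum() > threshold:
            boundaries.append(i)
        prev = cur
    return boundaries

# Synthetic clip: 5 dark frames followed by 5 bright frames -> one cut at index 5.
dark = [np.full((32, 32, 3), 10, dtype=np.uint8)] * 5
bright = [np.full((32, 32, 3), 240, dtype=np.uint8)] * 5
print(detect_shot_boundaries(dark + bright))  # -> [5]
```

Real shot detectors are more robust (they must handle fades, motion, and lighting changes), but the core idea of measuring frame-to-frame change is the same.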
- Video Content Analysis- In this stage, AutoFlip uses deep learning-based object detection models to find interesting, salient content in the frame. This content typically includes people and animals, but other elements may be identified depending on the application, such as text overlays and logos for commercials, or motion and ball detection for sports. Face and object detection models are integrated with AutoFlip through MediaPipe, a framework for building pipelines that process multimodal data, which uses Google's TensorFlow Lite machine learning framework on processors. This structure makes AutoFlip extensible, according to Google, so developers can add detection algorithms for different use cases and video content.
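Conceptually, this stage turns each frame into a set of scored detections that the cropping stage must keep in view. The sketch below is a simplified stand-in: the `Detection` type, the score threshold, and the union-box heuristic are assumptions for illustration; AutoFlip's real detectors run inside a MediaPipe graph.

```python
# Illustrative sketch of the content-analysis stage: filter detector
# output by confidence and compute the region the crop should preserve.
from dataclasses import dataclass

@dataclass
class Detection:
    label: str    # e.g. "face", "text-overlay", "logo"
    score: float  # detector confidence in [0, 1]
    box: tuple    # (x, y, width, height) in pixels

def salient_regions(detections, min_score=0.5):
    """Keep confident detections, most confident first."""
    kept = [d for d in detections if d.score >= min_score]
    return sorted(kept, key=lambda d: d.score, reverse=True)

def union_box(regions):
    """Smallest box covering all salient regions -- a simple target
    for the reframing stage to keep in view."""
    x0 = min(d.box[0] for d in regions)
    y0 = min(d.box[1] for d in regions)
    x1 = max(d.box[0] + d.box[2] for d in regions)
    y1 = max(d.box[1] + d.box[3] for d in regions)
    return (x0, y0, x1 - x0, y1 - y0)

frame_detections = [
    Detection("face", 0.92, (100, 50, 80, 80)),
    Detection("logo", 0.40, (0, 0, 40, 20)),  # below threshold, dropped
    Detection("text-overlay", 0.75, (60, 300, 200, 40)),
]
regions = salient_regions(frame_detections)
print([d.label for d in regions])  # -> ['face', 'text-overlay']
print(union_box(regions))          # -> (60, 50, 200, 290)
```

Because the detections are plain labeled boxes, swapping in a different detector (say, ball tracking for sports footage) only changes what produces the `Detection` objects, which mirrors the extensibility Google describes.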
- Reframing- In the final stage, the model determines whether to use stationary mode, for scenes that take place in a single space, or tracking mode, for when objects of interest are constantly moving. Based on that choice, and on the target dimensions in which the video needs to be displayed, AutoFlip crops frames while reducing jitter and retaining the content of interest.
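The stationary-versus-tracking decision can be sketched as follows. The decision rule (how much the salient region's center moves across the shot) and the horizontal-only crop are simplifications assumed for this example, not AutoFlip's actual heuristics.

```python
# Sketch of the reframing stage: choose a crop window per frame for a
# target aspect ratio, either fixed (stationary mode) or following the
# region of interest (tracking mode).
def reframe(frame_w, frame_h, target_aspect, centers, stationary_threshold=0.05):
    """Return a list of (x, y, w, h) crop windows, one per frame.
    `centers` holds the x-coordinate of the salient region per frame.
    If the centers barely move, one fixed window is reused (no jitter);
    otherwise the window tracks the region frame by frame."""
    crop_w = min(frame_w, round(frame_h * target_aspect))
    crop_h = min(frame_h, round(crop_w / target_aspect))

    def window(cx):
        # Center the crop on cx, clamped so it stays inside the frame.
        x = min(max(cx - crop_w // 2, 0), frame_w - crop_w)
        return (x, 0, crop_w, crop_h)

    spread = (max(centers) - min(centers)) / frame_w
    if spread < stationary_threshold:              # stationary mode
        fixed = window(sum(centers) // len(centers))
        return [fixed] * len(centers)
    return [window(c) for c in centers]            # tracking mode

# 1920x1080 landscape source reframed to 9:16 portrait.
# Nearly static subject -> one fixed window for the whole shot.
print(reframe(1920, 1080, 9/16, [960, 962, 958])[0])  # -> (656, 0, 608, 1080)
# Fast-moving subject -> the window follows it, clamped at the edges.
print(reframe(1920, 1080, 9/16, [100, 960, 1900]))
```

A production system would additionally smooth the tracking path over time to avoid jitter, which is exactly the stabilization AutoFlip performs.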
Google researchers said in a post on the Google AI Blog:
We are excited to release this tool directly to developers and filmmakers. Like any machine learning algorithm, AutoFlip can benefit from an improved ability to detect objects relevant to the intent of the video, such as speaker detection for interviews or animated face detection on cartoons. Additionally, a common issue arises when input video has important overlays on the edges of the screen (such as text or logos) as they will often be cropped from the view. By combining text/logo detection and image inpainting technology, we hope that future versions of AutoFlip can reposition foreground objects to better fit the new aspect ratios. Lastly, in situations where padding is required, deep uncrop technology could provide improved ability to expand beyond the original viewable area.
You can check out AutoFlip's code here.