09 September 2024 | 10 min read
Implementing a Scalable, Pull-Based Video Processing Architecture Using Google Cloud Run, Pub/Sub, and Autoscaling
DarkHorse
In the fast-evolving world of digital media, the need for scalable and efficient video processing solutions is paramount. Whether it's content delivery platforms, media companies, or any service dealing with large volumes of video data, managing and processing video workloads can be both challenging and resource-intensive. This solution has a profound impact on the DarkHorse platform, helping athletes and their parents access videos regardless of their location. Whether they're sitting on a small island like San Juan Island off the coast of the U.S. or in the heart of New York City, they can seamlessly fetch and analyze video content, empowering their journey in sports development with reliable, high-performance media access.

To meet these demands, we've architected a dynamic, pull-based video processing system leveraging Google Cloud's robust suite of services: Google Cloud Run, Pub/Sub, Cloud Scheduler, and Cloud Functions. This blog post dives into the technical details of how we built this system, the challenges we addressed, and the innovations we implemented to achieve scalability, efficiency, and cost optimization.
The Core Challenge: Managing High-Volume Video Processing
Video processing involves multiple computationally heavy tasks such as encoding, transcoding, thumbnail generation, and more. These tasks need to be handled efficiently to meet user expectations for timely content delivery. Our key challenges were:
- Scalability: The system must handle spikes in video processing requests during peak times without degradation in performance.
- Cost-Efficiency: The infrastructure must scale down during periods of low demand to minimize operational costs.
- Task Management: We needed an efficient mechanism to distribute video processing tasks and ensure they were processed promptly.
Designing the Solution: A Pull-Based Architecture
To address these challenges, we designed a pull-based architecture that integrates several Google Cloud services to create a highly scalable and flexible video processing pipeline. Here's a detailed breakdown of our solution:
1. Google Pub/Sub: Asynchronous Task Distribution
Google Pub/Sub is the messaging backbone of our system. Whenever a new video needs processing, a message containing all necessary metadata (e.g., video URL, processing parameters, priority levels) is published to a specific Pub/Sub topic. This decouples the process of task creation from task execution, allowing our system to handle varying loads more gracefully.
Each video processing request is encapsulated in a Pub/Sub message, enabling us to queue, distribute, and manage these tasks asynchronously. Of Pub/Sub's two subscription models, push and pull, we opted for the pull-based approach: processing tasks are pulled from the queue only as resources become available, preventing system overload.
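As a sketch, a task message might be built like this. The field names (`video_url`, `processing_parameters`, `priority`) are illustrative assumptions, not our exact schema:

```python
import json


def encode_task(video_url: str, output_format: str, priority: str = "normal") -> tuple[bytes, dict]:
    """Serialize a video-processing request into a Pub/Sub message body
    plus attributes. Field names here are illustrative, not our exact schema."""
    payload = {
        "video_url": video_url,
        "processing_parameters": {"output_format": output_format},
        "priority": priority,
    }
    # Pub/Sub message data must be bytes; attributes are string key/value
    # pairs that subscribers can filter on without decoding the body.
    data = json.dumps(payload).encode("utf-8")
    attributes = {"priority": priority}
    return data, attributes


# With the google-cloud-pubsub client, this pair would be published as:
#   publisher.publish(topic_path, data, **attributes)
```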
2. Google Cloud Scheduler: Regular Polling and Task Triggering
Our pull-based mechanism is powered by Google Cloud Scheduler, a fully managed cron-like service that triggers jobs at regular intervals. We configured Cloud Scheduler to run every few minutes to poll our Pub/Sub subscription for new messages.
The scheduler invokes a custom script or API endpoint that pulls a batch of messages from the Pub/Sub subscription. This controlled polling lets us dictate how frequently the system checks for new tasks and helps distribute workloads evenly across available processing resources.
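The batching logic can be sketched as follows. An in-memory deque stands in for the subscription here so the example is self-contained; the real code calls the synchronous pull API of the google-cloud-pubsub client, as noted in the comment:

```python
from collections import deque


def pull_batch(subscription: deque, max_messages: int) -> list:
    """Pull up to max_messages tasks from the queue.

    The deque stands in for a Pub/Sub subscription. The real system calls
    subscriber.pull(request={"subscription": path, "max_messages": n})
    and acknowledges message IDs only after a successful hand-off, so an
    unacked message is redelivered if a worker crashes.
    """
    batch = []
    while subscription and len(batch) < max_messages:
        batch.append(subscription.popleft())  # popping ~ pull + ack
    return batch
```

Capping the batch size is what makes the polling "controlled": a scheduler tick never hands out more work than the configured processing capacity.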
3. Google Cloud Run Jobs: Containerized Video Processing
At the heart of our video processing pipeline are Google Cloud Run jobs. Unlike Cloud Run services, which scale with incoming requests, Cloud Run jobs run containers to completion, which fits batch workloads like video processing. Each task is handled by an execution of a Cloud Run job, which processes the video as specified in the Pub/Sub message.
The benefits of using Cloud Run jobs include:
- Isolation: Each video processing task runs in its own container, ensuring that tasks are isolated from each other, reducing the risk of cross-task interference.
- Scalability: Cloud Run automatically scales the number of container instances based on demand, ensuring that we have enough processing power during peak times.
- Flexibility: By containerizing our processing logic, we can use any programming language, framework, or library that suits our needs, making the development and deployment process more agile.
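As a minimal sketch of what runs inside a job container, the code below decodes one task message and builds a transcode command. ffmpeg and the message field names are illustrative assumptions standing in for our actual processing step:

```python
import json


def build_transcode_command(message_data: bytes, local_input: str, local_output: str) -> list[str]:
    """Decode one task message and construct the transcode command to run.

    ffmpeg and the field names are illustrative; the real job downloads the
    video from the URL in the message, processes it per the message
    parameters, and uploads the result to the designated storage.
    """
    task = json.loads(message_data.decode("utf-8"))
    codec = task.get("processing_parameters", {}).get("output_format", "h264")
    # Map our format name to the matching ffmpeg encoder.
    encoder = "libx264" if codec == "h264" else codec
    return ["ffmpeg", "-i", local_input, "-c:v", encoder, local_output]
```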
4. Autoscaling with Google Cloud Functions: Dynamic Resource Management
To further optimize our system's performance, we implemented an autoscaler using Google Cloud Functions. This Cloud Function is also triggered by Cloud Scheduler, running every few minutes to assess the current system load.
The autoscaler function queries the Pub/Sub subscription to determine the number of pending messages (i.e., unprocessed video tasks). Based on the load, the autoscaler dynamically adjusts the number of active Cloud Run jobs:
- Scaling Up: If the number of pending tasks exceeds a predefined threshold, the autoscaler increases the number of Cloud Run job instances, allowing more tasks to be processed concurrently.
- Scaling Down: Conversely, if the load decreases, the autoscaler reduces the number of active instances to conserve resources and minimize costs.
The autoscaler function is a critical component of our system, ensuring that we always have the right amount of processing power available to meet demand without over-provisioning resources.
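The scaling decision itself reduces to a small pure function. The thresholds below are illustrative assumptions; in practice the backlog comes from the Cloud Monitoring metric `subscription/num_undelivered_messages`:

```python
import math


def target_task_count(backlog: int, tasks_per_instance: int = 5,
                      min_instances: int = 0, max_instances: int = 20) -> int:
    """Map the number of undelivered Pub/Sub messages to a desired number
    of Cloud Run job executions, clamped to a configured range.

    The defaults are illustrative; the real autoscaler reads the backlog
    from Cloud Monitoring and tunes these knobs per workload.
    """
    desired = math.ceil(backlog / tasks_per_instance)
    return max(min_instances, min(desired, max_instances))
```

Clamping to `max_instances` caps spend during extreme spikes, while `min_instances = 0` lets the pipeline scale to zero when the queue is empty.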
5. System Workflow: From Task Creation to Completion
To provide a comprehensive overview, let's walk through a typical workflow from task creation to completion in our video processing system:
1. Task Creation: When a new video processing request is generated (e.g., a user uploads a video that needs encoding), a message is published to the relevant Google Pub/Sub topic.
2. Task Polling: Cloud Scheduler triggers a polling process every few minutes, pulling new messages from the Pub/Sub subscription. Each message represents a video processing task.
3. Task Execution: Pulled messages are passed to Google Cloud Run jobs, which are automatically instantiated to process the videos. Each job reads the metadata from the Pub/Sub message, downloads the video from the provided URL, processes it according to the specified parameters, and uploads the processed video back to the designated storage.
4. Dynamic Scaling: The autoscaler function, triggered by another Cloud Scheduler job, checks the load by querying the number of unprocessed messages in Pub/Sub. If the load is high, it scales up the number of Cloud Run job instances. If the load decreases, it scales down the instances, ensuring efficient use of resources.
5. Task Completion: Once a video is processed, the Cloud Run job completes, and the processed video is made available for download or further use.
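The five steps above can be sketched end to end as a single scheduler tick, with an in-memory queue standing in for the Pub/Sub subscription:

```python
from collections import deque


def run_cycle(queue: deque, batch_size: int) -> list:
    """One scheduler tick: pull a bounded batch and 'process' each task.

    The deque stands in for the Pub/Sub subscription and plain strings for
    full task messages; the actual processing happens inside Cloud Run job
    containers, not in this loop.
    """
    processed = []
    for _ in range(min(batch_size, len(queue))):
        task = queue.popleft()            # step 2: poll
        processed.append(f"done:{task}")  # steps 3-5: execute and complete
    return processed


# Example: with three queued videos and a batch size of two, one tick
# processes two tasks and leaves the third for the next tick.
```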
Technical Challenges and Solutions
- Handling Burst Traffic
Our system was designed with the understanding that video processing requests might not arrive at a steady rate. We used the pull-based model to prevent the system from being overwhelmed during sudden spikes. The Cloud Run autoscaler and dynamic job scaling were essential in managing these bursts effectively.
- Cost Optimization
By integrating the autoscaler function, we ensured that resources were only scaled up when absolutely necessary and scaled down during low-traffic periods. This approach minimized our operational costs, as we avoided the overhead associated with running idle instances.
- Monitoring and Logging
We integrated Cloud Logging and Cloud Monitoring (formerly Stackdriver) to keep track of job execution times, resource usage, and system health. This allowed us to quickly identify and resolve bottlenecks or issues in the processing pipeline.
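One convenient property of Cloud Run is that JSON lines printed to stdout are parsed by Cloud Logging into structured entries, with `severity` and `message` treated as special fields. A sketch of how a job might emit a timing metric this way (the remaining field names are our own illustrative additions):

```python
import json


def job_log_entry(job_name: str, duration_s: float, status: str) -> str:
    """Build a one-line JSON log entry for a finished job.

    Printing this to stdout inside a Cloud Run container lets Cloud Logging
    parse it as a structured entry: 'severity' and 'message' are recognized
    specially, and the extra fields become queryable payload keys.
    """
    return json.dumps({
        "severity": "INFO",
        "message": f"{job_name} finished: {status}",
        "job_name": job_name,
        "duration_seconds": round(duration_s, 2),
    })


# Inside the job: print(job_log_entry("encode-job", elapsed, "success"))
```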
Summary
Our architecture leverages the power of Google Cloud's managed services to create a dynamic, scalable, and cost-effective video processing system. By combining Google Cloud Run jobs with Pub/Sub, Cloud Scheduler, and a custom autoscaler function, we built a robust pipeline capable of handling fluctuating workloads while optimizing resource usage.
This pull-based mechanism not only meets our current video processing needs but also provides a flexible foundation for future enhancements. Whether we need to support new video formats, integrate additional processing steps, or scale to even higher volumes, our system is designed to adapt and grow with our requirements.
For organizations facing similar challenges in video processing or any other resource-intensive workloads, our approach demonstrates how Google Cloud's serverless and containerized offerings can be harnessed to build scalable, efficient, and cost-effective solutions.