Spark Structured Streaming Gets Real-Time Mode for Ultra-Low Latency
Apache Spark Structured Streaming has long served as a robust foundation for high-scale, mission-critical data pipelines, enabling everything from sophisticated streaming Extract, Transform, Load (ETL) operations to near real-time analytics and machine learning applications. Now, a significant evolution is underway with the introduction of “real-time mode,” a new trigger type designed to extend Spark’s capabilities to an entirely new class of ultra-low-latency workloads. The new mode processes events as they arrive, achieving latencies as low as tens of milliseconds, a substantial leap forward for immediate data processing needs.
Unlike its predecessors, such as the ProcessingTime trigger, which fires on a fixed schedule, or the AvailableNow trigger, which processes all currently available data and then stops, real-time mode processes data continuously, emitting results the moment they are ready and enabling truly instantaneous use cases. Businesses can now leverage Spark for scenarios demanding immediate action, such as real-time fraud detection, dynamic personalization, and serving machine learning features with minimal delay. Crucially, this enhanced capability requires no changes to existing codebases and no replatforming for current Structured Streaming users. The new mode is being contributed to the open-source Apache Spark project and is currently available in Public Preview on Databricks.
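To ground the trigger comparison, here is a minimal PySpark sketch of how the existing trigger types are set on a streaming query. The real-time trigger is deliberately left as a commented note rather than real code, since this article does not spell out the Public Preview syntax and the exact builder argument may differ.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("trigger-examples").getOrCreate()

# Self-contained streaming source for illustration (the built-in rate source).
events = spark.readStream.format("rate").option("rowsPerSecond", 100).load()

# ProcessingTime trigger: kick off a micro-batch on a fixed schedule.
query = (
    events.writeStream
    .format("console")
    .trigger(processingTime="10 seconds")
    .start()
)

# AvailableNow trigger: process everything currently available, then stop.
#   .trigger(availableNow=True)
#
# Real-time mode is exposed as another trigger type that processes events as
# they arrive rather than in micro-batches; it slots into the same
# .trigger(...) position, but its exact Public Preview syntax is not shown
# here to avoid guessing at the API.

query.awaitTermination(30)  # let the micro-batch query run briefly
query.stop()
```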
At its core, real-time mode delivers continuous, low-latency processing by running long-lived streaming jobs that schedule stages concurrently. Data flows between tasks in memory via a technique known as streaming shuffle. This approach significantly reduces coordination overhead, eliminates the fixed scheduling delays inherent in traditional micro-batch execution, and consistently delivers sub-second performance. Internal Databricks tests have shown 99th percentile (p99) latencies ranging from single-digit milliseconds to approximately 300 milliseconds, depending on the complexity of the data transformations involved, affirming its capacity for demanding real-time applications.
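To make the stage boundary concrete, the generic PySpark sketch below contains a shuffle (the groupBy aggregation). Under classic micro-batch execution that map-to-reduce hand-off is scheduled and materialized once per batch; the streaming shuffle described above is what keeps that hand-off flowing in memory. The query itself is ordinary Structured Streaming code, not anything specific to the new mode.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("shuffle-boundary-sketch").getOrCreate()

# Rate source stands in for a real event stream such as Kafka.
events = (
    spark.readStream.format("rate").option("rowsPerSecond", 1000).load()
    # Derive a synthetic key so the aggregation below forces a shuffle.
    .withColumn("key", F.col("value") % 10)
)

# The groupBy introduces a stage boundary: upstream tasks must hand their
# output to downstream tasks. Real-time mode streams that hand-off in memory
# instead of waiting on per-batch scheduling.
counts = events.groupBy("key").count()

query = (
    counts.writeStream
    .outputMode("update")
    .format("console")
    .start()
)
query.awaitTermination(30)
query.stop()
```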
The implications of real-time mode are profound for streaming applications that require rapid response times, especially those integral to critical business operations. Early adopters have already deployed it across a diverse array of workloads. For example, a global bank now processes credit card transactions from Kafka in real time, flagging suspicious activity within 200 milliseconds, which drastically reduces financial risk and accelerates response times without requiring an overhaul of its existing infrastructure.
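A sketch of what such a pipeline might look like in plain Structured Streaming terms follows. The broker address, topic names, schema, and the simple amount threshold standing in for a real fraud model are all hypothetical, and the real-time trigger itself is omitted for the same reason as above.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import StructType, StructField, StringType, DoubleType

spark = SparkSession.builder.appName("fraud-flagging-sketch").getOrCreate()

# Hypothetical transaction schema and Kafka topic names, for illustration only.
txn_schema = StructType([
    StructField("card_id", StringType()),
    StructField("merchant", StringType()),
    StructField("amount", DoubleType()),
    StructField("country", StringType()),
])

transactions = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")  # placeholder address
    .option("subscribe", "card-transactions")           # hypothetical topic
    .load()
    .select(F.from_json(F.col("value").cast("string"), txn_schema).alias("txn"))
    .select("txn.*")
)

# Toy rule standing in for a real fraud model: flag unusually large amounts.
flagged = transactions.filter(F.col("amount") > 10_000)

query = (
    flagged.selectExpr("card_id AS key", "to_json(struct(*)) AS value")
    .writeStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")
    .option("topic", "flagged-transactions")             # hypothetical topic
    .option("checkpointLocation", "/tmp/checkpoints/fraud-flagging")
    .start()
)
```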
Beyond financial services, real-time mode is revolutionizing personalized experiences in retail and media. An Over-The-Top (OTT) streaming provider can now update content recommendations immediately after a user finishes watching a show, while a leading e-commerce platform recalculates product offers as customers browse, maintaining high engagement through sub-second feedback loops. Similarly, major travel sites are leveraging this mode to track and surface each user’s recent searches across devices in real time, with every new query instantly updating the session cache, thereby enabling personalized results and autofill without any perceptible delay. In the realm of machine learning, a food delivery application is using real-time mode to update features like driver location and preparation times in milliseconds. These updates feed directly into machine learning models and user-facing applications, leading to improved estimated time of arrival (ETA) accuracy and an enhanced customer experience.
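As an illustration of the feature-freshness pattern described above, here is a generic Structured Streaming sketch that keeps the latest location per driver and hands it to an online store inside foreachBatch. The topic, schema, and the placeholder write step are assumptions, and nothing here asserts which sinks or operators the new mode supports.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.window import Window
from pyspark.sql.types import (
    StructType, StructField, StringType, DoubleType, TimestampType,
)

spark = SparkSession.builder.appName("driver-features-sketch").getOrCreate()

# Hypothetical schema for driver location pings.
ping_schema = StructType([
    StructField("driver_id", StringType()),
    StructField("lat", DoubleType()),
    StructField("lon", DoubleType()),
    StructField("event_time", TimestampType()),
])

pings = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")   # placeholder address
    .option("subscribe", "driver-location-pings")        # hypothetical topic
    .load()
    .select(F.from_json(F.col("value").cast("string"), ping_schema).alias("p"))
    .select("p.*")
)

def update_online_features(batch_df, batch_id):
    """Placeholder write step: keep only the freshest ping per driver.

    A real implementation would call a feature store or key-value store
    client here; this sketch just logs how many rows it would push.
    """
    latest = (
        batch_df.withColumn(
            "rn",
            F.row_number().over(
                Window.partitionBy("driver_id").orderBy(F.col("event_time").desc())
            ),
        )
        .filter("rn = 1")
        .drop("rn")
    )
    print(f"batch {batch_id}: would update {latest.count()} driver feature rows")

query = (
    pings.writeStream
    .foreachBatch(update_online_features)
    .option("checkpointLocation", "/tmp/checkpoints/driver-features")
    .start()
)
```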
Raja Kanchumarthi, Lead Data Engineer at Network International, highlighted the tangible benefits, stating, “For a mission-critical payments authorization pipeline, where we perform encryption and other transformations, we achieved P99 end-to-end latency of just 15 milliseconds. We’re optimistic about scaling this low-latency processing across our data flows while consistently meeting strict SLAs.” These diverse applications underscore the versatility of real-time mode, demonstrating its capacity to turn data into immediate decisions across a spectrum of use cases, from IoT sensor alerts and supply chain visibility to live gaming telemetry and in-app personalization.