Welcome!

SOA & WOA Authors: Jeremy Geelan, Kevin Jackson, Maureen O'Gara, John Savageau, Greg Ness

Related Topics: SOA & WOA

SOA & WOA: Article

Introduction to Complex Event Processing & Data Streams

OMG's Data Distribution Service data streams when you need high performance from complex event processing systems

How to Select the Right Messaging Protocol for Your CEP Engine
Selecting the right CEP engine is only one part of building your event processing solution. The other significant part of the architecture is selecting the right messaging bus for your data stream. With the correct selection, you can fully leverage the high-performance CEP engine, scale to a large number of nodes, and do more. Consider the use case of trading desk applications where microseconds matter in identifying a trend for buying or selling a stock. Regardless of how fast the CEP engine may be a typical JMS implementation that usually delivers messages in tens of milliseconds will be a non-starter. (Figure 1)

Here are some of the criteria to consider in adopting a messaging bus for your complex event processing needs:

•  Latency: CEP isn't about speed. It's about correlating data from different data streams. However, chances are that if you're considering using CEP then latency (Figure 2) is a significant concern for developing a successful application. In algorithmic trading, command and control, and fraud detection applications for electronic trades low latency of the entire application solution is critical. In trading desk applications, even the physical proximity of the trading floor to the exchange has become a critical issue. The extra nanosecond that a remote trade takes to register could cost the bank a deal. Many applications that rely on high-performance data streams simply can't afford the periodic (JVM) garbage collection that can degrade overall performance. Messaging products like implementations of OMG's Data Distribution Service, which have been designed and deployed for real-time mission-critical applications, can deliver event samples with a latency an order better than traditional JMS applications. In addition, while the overhead of Data Distribution Service depends on the message size and transport used. For reasonably sized messages and standard transports (100 Mbit-1 Gbit Ethernet) the overhead is typically less than 15% above the raw transport.
•  Managing network bandwidth: Many developers who are considering CEP engines are also concerned with managing network bandwidth efficiently (Figure 3). In trading desk applications, the total volume of stocks traded daily has been following its own Moore's Law, doubling every 18 months since the start of electronic trading. This means the amount of data that must be processed follows the same curve (because the houses have to track all trades, not just their own). In addition, they have internal consumers of the data whose demands are expanding. New trading strategies are being developed that run in parallel with existing ones. Each model requires input and generates output.

Some messaging vendors can manage the network bandwidth more efficiently by packing more event samples into the network pipe and by sending only the relevant data over the network.

For example, with the Data Distribution Service, middleware can apply content-based filtering using SQL-like patterns so that only relevant data is sent over the network.

With pub-sub implementations of the Data Distribution Service, developers can configure the rate at which event samples have to be transmitted on a per data stream basis.

Consider the example of a market-data application that sends all stock ticks to the trading application. However, using CEP, sending a snapshot of the five-second stock high and low may not be required. By configuring a different transmission rate for the stock tick feed, and the "five-second high-low" feed, we can preserve network resources.

In addition, with intelligent network middleware for data streams, we can conserve network bandwidth by determining if the data has to be transmitted at all. For example, with the Data Distribution Service applications can subscribe to topics of interest such as stock news relating to a particular symbol or specific content within the topics of interest. If there are no subscribers for a given topic, RTI won't put the event samples on the network.

•  Quality of Service (QoS) for data streams: QoS refers to a service contract made between two entities. Each data stream has unique attributes or characteristics. For example, a trading desk application may require resending all lost stock ticker change events (reliable transmission). However, a security surveillance application may only require that the data stream for video transmission use best-effort (rather than reliable) techniques. The Data Distribution Service provides a rich set of QoS contracts that can be enforced out-of-the-box. Some examples of such pre-built contracts include ownership strength (selecting the right data source when multiple sources are generating the same data for failover), history (how many event samples are needed for late-joining consumers of information), reliability, time- and content-based filtering of event samples at the source, and persistence (in case the publisher dies and a late-joiner consumer of the information arrives).

•  Number of messages a second: Use cases requiring Complex Event Processing typically demand that messages be transmitted at a rate of 1,000/second to 500,000/sec. This is understandable considering that the CEP application typically integrates with edge devices or, as in trading applications, process a large amount of market data from multiple exchanges to make trend inferences. In such cases high-performance middleware should be capable of intelligently compressing the messages to fit the given system's MTU. Vendors such as 29West and RTI provide performance numbers in this range. For example, with RTI the user can transmit up to 10 million four-byte messages a second.

•  Supporting heterogeneous platforms: From one point-of-view, CEP engines are about integrating data streams from different sources. While different data streams can use their own messaging protocols, this poses a needless headache for application architects by having multiple technology stacks for their data stream implementations. Messaging protocols like JMS are supported on major enterprise platforms, but aren't prevalent in embedded ecosystems where edge devices reside. Traditionally, the Data Distribution Service implements a vast set of architectures including, but not limited to, enterprise platforms like Linux, Solaris, and Windows and embedded platforms like VxWorks, LynxOS, and Integrity operating systems.

Summary
Remember the mid-90s? Some people still assembled their own PCs by purchasing the right combination of motherboards, processors, and disk. But a non-optimal combination of motherboard and processor ensured that you didn't get the right performance from your system. Similarly, while CEP engines herald the promise of delivering high-performance event detection and generation, you need to ensure that they run with data streams with compatible aims of low latency, high availability, throughput, and flexibility.

More Stories By Supreet Oberoi

Supreet Oberoi is vice president, engineering for RTI, and he brings over a decade of experience in building and deploying Web-based enterprise applications. He was a founding member and director of engineering for Trading Dynamics, which was acquired by Ariba in Nov 1999. Later, he led the engineering organization for the Mohr-Davidow (MDV) funded startup, oneREV Inc, that was acquired by Agile Software in 2002. Most recently, Supreet served as Director of Engineering at Agile Software. Supreet received his BS in computer sciences with Highest Honors from the University of Texas at Austin, and an MS in computer sciences from Stanford University.

Comments (0)

Share your thoughts on this story.

Add your comment
You must be signed in to add a comment. Sign-in | Register

In accordance with our Comment Policy, we encourage comments that are on topic, relevant and to-the-point. We will remove comments that include profanity, personal attacks, racial slurs, threats of violence, or other inappropriate material that violates our Terms and Conditions, and will block users who make repeated violations. We ask all readers to expect diversity of opinion and to treat one another with dignity and respect.