By Supreet Oberoi
October 1, 2007 01:00 PM EDT
With the evolution of distributed IT systems and the advent of Web Services, applications can now make more informed decisions by using real-time information from third-party sources. For example, today's automated trading applications can make over 30 trades a second by analyzing stock trends, market movements, news, and events that may only be relevant for a fraction of a second. A trading algorithm may infer a negative stock trend line for the next tenth of a second and may short the stock for that bit of time.
Such applications require Complex Event Processing (CEP) engines. These engines can detect patterns of activity from multiple data streams and infer events continuously. Many critical CEP use cases require peak performance from both the event engine and the data streams. In the example above, the trading algorithm can't profit from buying and selling within a tenth of a second if the combined latency of the event engine and the data stream exceeds the operable time window.
This article will survey the suitability of OMG's Data Distribution Service data streams in use cases that demand high performance from complex event processing systems.
What Is Complex Event Processing?
Consider the following example: The Securities and Exchange Commission has different margin and reserve requirements for traders classified as "pattern day traders." The term "pattern day trader" means any customer who executes four or more day trades in the same stock in five business days. However, if the number of day trades is 6% or less of his total trades for the five-business-day period, the customer won't be considered a pattern day trader and the special margin and reserve requirements won't apply.
With Complex Event Processing, we can detect a "pattern day trader" in real-time as follows. Each time a trader makes a transaction, the system posts an executedTradeEvent <traderId, stockId> event. The CEP engine maintains a window of five business days and searches for cases where the count (count_n) of executedTradeEvents for a given trader and stock reaches four or more. On detecting such a pattern, it counts the total number of trades that trader made in the last five business days. If count_n is more than 6% of that total, it posts a detectedDayTrader event.
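The detection steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular CEP engine works: the class and method names are hypothetical, the window uses calendar days rather than business days, events are assumed to arrive in timestamp order, and the trader's total trade count is taken from the same window instead of from a relational database.

```python
from collections import deque
from datetime import datetime, timedelta

WINDOW = timedelta(days=5)   # assumption: calendar days, not business days
MIN_DAY_TRADES = 4           # "four or more day trades in the same stock"
MIN_SHARE = 0.06             # count_n must be more than 6% of all trades


class PatternDayTraderDetector:
    """Hypothetical sketch: infers detectedDayTrader from executedTradeEvents."""

    def __init__(self):
        # (timestamp, trader_id, stock_id) tuples, oldest first
        self._events = deque()

    def on_executed_trade(self, timestamp, trader_id, stock_id):
        """Handle one executedTradeEvent; return an inferred event or None."""
        self._events.append((timestamp, trader_id, stock_id))
        # Evict events that have fallen out of the five-day window.
        while self._events and timestamp - self._events[0][0] > WINDOW:
            self._events.popleft()

        trader_events = [e for e in self._events if e[1] == trader_id]
        count_n = sum(1 for e in trader_events if e[2] == stock_id)
        total = len(trader_events)  # assumption: stands in for the database lookup
        if count_n >= MIN_DAY_TRADES and count_n > MIN_SHARE * total:
            return {"event": "detectedDayTrader", "traderId": trader_id,
                    "stockId": stock_id, "count_n": count_n, "total": total}
        return None


# Usage: the fourth trade in the same stock triggers the inferred event.
detector = PatternDayTraderDetector()
start = datetime(2007, 10, 1, 9, 30)
for hour in range(4):
    inferred = detector.on_executed_trade(
        start + timedelta(hours=hour), "trader-1", "ACME")
print(inferred)
```

Note that the application never sends detectedDayTrader itself. The event exists only because the detector correlates a window of executedTradeEvents with the trader's total activity.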
Based on this example, we can describe many key characteristics of Complex Event Processing:
• Events are inferred. In the example, the trading application doesn't and can't send a detectedDayTrader event. This event has to be inferred by processing other events like executedTradeEvent, maintaining a count of events satisfying a query condition over a given period of time and integrating with non-streaming content like the number of total trades stored in a relational database.
• The system correlated the data from multiple sources to infer the event. In the example, it included explicit sources like the database that stored the total number of trades for a customer and implicit sources like time.
• The system wouldn't have sent a detectedDayTrader if there wasn't a set of executedTradeEvents preceding it. The executedTradeEvent plus some other criteria caused the detectedDayTrader event to occur.
CEP engines manage event-driven information systems by detecting complex patterns and by building correlations and relationships, such as causality and timing, among many events.
From a black-box view, a CEP engine takes as input a set of data streams from sources such as RTI, database files, or JMS. Most CEP engines use an SQL-like programming language with extensions for event processing, such as the Continuous Computation Language (CCL).
While discussing the entire semantics of CCL or CEP is beyond the scope of this article, here's a simple example of how application developers can infer a weather event using the CCL programming language.
In this example, the query searches for events from the input data stream WindIn in which the wind speed changes by more than five miles an hour in two seconds. The events matching the pattern are then inserted into an output data stream WindPatternOut:
INSERT INTO WindPatternOut (Location, Speed1, Speed2)
SELECT W1.Location, W1.WindSpeed, W2.WindSpeed
FROM WindIn W1, WindIn W2
MATCHING [2 SECONDS: W1 && W2]
ON W1.Location = W2.Location
WHERE (W1.WindSpeed - W2.WindSpeed) >= 5;
As you can see from this example, CCL is loosely based on SQL semantics with extensions (such as matching patterns, viewing samples in a given time window) for complex events. The input and the output data streams are logically modeled as database tables, regardless of the underlying messaging protocol.
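For readers who don't know CCL, the matching logic of the query can be approximated in Python. This is a rough sketch, not a description of how a CCL engine evaluates the query: the function name and tuple layout are hypothetical, readings are assumed to arrive in timestamp order, and the sketch assumes the two readings may match in either order, so it reports the larger speed as Speed1.

```python
from collections import defaultdict, deque

WINDOW_SECONDS = 2.0   # mirrors MATCHING [2 SECONDS: ...]
MIN_DELTA = 5.0        # miles per hour, mirrors the WHERE clause


def match_wind_pattern(wind_in):
    """Yield (location, speed1, speed2) rows for matching pairs of readings.

    wind_in is an iterable of (timestamp_seconds, location, wind_speed)
    tuples in timestamp order (hypothetical layout).
    """
    recent = defaultdict(deque)  # location -> (timestamp, speed) in the window
    for ts, location, speed in wind_in:
        window = recent[location]
        # Drop readings older than the two-second window.
        while window and ts - window[0][0] > WINDOW_SECONDS:
            window.popleft()
        # Compare the new reading with each earlier one at the same location.
        for _, earlier_speed in window:
            high = max(speed, earlier_speed)
            low = min(speed, earlier_speed)
            if high - low >= MIN_DELTA:
                yield (location, high, low)
        window.append((ts, speed))


# Usage: only the two "Dock" readings one second apart form a match.
wind_in = [
    (0.0, "Dock", 10.0),
    (1.0, "Dock", 16.0),
    (1.5, "Pier", 30.0),
    (4.0, "Dock", 22.0),
]
wind_pattern_out = list(match_wind_pattern(wind_in))
print(wind_pattern_out)
```

The hand-written version has to manage the window and the per-location state itself, which is the bookkeeping the MATCHING and ON clauses express declaratively.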
Why Use Complex Event Processing?
From an oversimplified view, applications have implemented functionality for inferring events from existing data for a very long time. Credit and fraud-risk applications, for example, have existed for decades without an explicit design and implementation for CEP. However, the data avalanche produced by edge devices like Radio Frequency Identification (RFID) readers and sensors is rapidly changing how detection algorithms are designed, and the need for a configurable and flexible way to detect patterns is becoming more vital.
Here are some situations in which software architects might consider incorporating CEP into their application technology stack:
• Only the "processed" data is useful: In applications where edge devices such as sensors or RFID readers connect to the enterprise, all the raw data samples aren't of equal interest to the enterprise business process. The data might need to be cleansed, validated, and enriched before it's useful. In the wind example, the application isn't interested in each sensor read of the wind speed. The application is only interested when the wind speed changes by more than five miles an hour in two seconds, possibly inferring a hurricane or tornado.
• Software development cycles can't keep up with the changes in algorithms for detecting patterns: In trading applications, patterns for detecting buy and sell events may only be competitive for a few months, or even a few weeks. In some cases, new patterns are discovered, implemented, and deployed in a day. In such cases, it's necessary to parameterize and abstract out the pattern detection layer of the application. Having the trading algorithm embedded in the application code is not a good design practice.
• Event processing has to be done in real-time: Not all event processing requires a CEP engine. Data warehouse applications analyze trends by correlating multiple dimensions of the data in an "offline" manner (for example, send Home Depot discount promotions to all residents who have moved to zip code 95138 in the last two months). Such traditional event-detection applications require that the data be persisted first and then correlated. However, in many applications, events have to be inferred in real-time, and the application simply can't wait for the raw data to be persisted and analyzed. Radar tracking applications, for example, must process events as they arrive, and a trading desk application seeks a competitive advantage by analyzing an event microseconds ahead of its competition.
• Event processing has to scale: The media is now heralding the arrival of truly ubiquitous computing, where tiny microprocessors in our ambient surroundings communicate to deliver a more intelligent service. For example, in healthcare monitoring, pressure sensors in the shoes of patients at risk of a heart attack can monitor any change in their pattern of walking and alert the healthcare provider, who can immediately schedule a check-up. Enterprise applications that integrate with the edge have to cope with a large number of sensors simultaneously sending high rates of data. Traditional applications won't be able to handle this impedance mismatch of high data rate and volume. Using a CEP engine designed for performance, one can take steps to manage the load. Selecting the right messaging protocol for the data stream is the other step.
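The second situation above, abstracting the pattern-detection layer out of the application code, can be illustrated with a small Python sketch. Everything here is hypothetical (the `ChangeRule` fields, the JSON layout, and the `detect` function), and the quadratic scan is for clarity only. A real CEP engine would express such rules in its own query language and evaluate them incrementally.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class ChangeRule:
    """Hypothetical rule: a field changes by min_change within window_seconds."""
    name: str
    field: str
    min_change: float
    window_seconds: float


def load_rules(config_text):
    """Build rules from JSON so that patterns can change without a code change."""
    return [ChangeRule(**entry) for entry in json.loads(config_text)]


def detect(rule, samples):
    """Return inferred events for samples (dicts with 'ts') in timestamp order."""
    events = []
    for i, current in enumerate(samples):
        for earlier in samples[:i]:
            in_window = current["ts"] - earlier["ts"] <= rule.window_seconds
            changed = abs(current[rule.field] - earlier[rule.field]) >= rule.min_change
            if in_window and changed:
                events.append({"rule": rule.name, "ts": current["ts"]})
                break  # at most one inferred event per raw sample
    return events


# Usage: the rule lives in configuration, not in the application code.
RULES_JSON = """
[{"name": "windShift", "field": "wind_speed",
  "min_change": 5, "window_seconds": 2}]
"""
samples = [
    {"ts": 0, "wind_speed": 10},
    {"ts": 1, "wind_speed": 11},
    {"ts": 2, "wind_speed": 17},
    {"ts": 5, "wind_speed": 30},
]
for rule in load_rules(RULES_JSON):
    print(detect(rule, samples))
```

Tightening or replacing a pattern then means editing the rule definition, while the code that feeds samples and consumes inferred events stays untouched.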
Copyright © 2007 SYS-CON Media, Inc. — All Rights Reserved.
More Stories By Supreet Oberoi
Supreet Oberoi is vice president, engineering for RTI, and he brings over a decade of experience in building and deploying Web-based enterprise applications. He was a founding member and director of engineering for Trading Dynamics, which was acquired by Ariba in Nov 1999. Later, he led the engineering organization for the Mohr-Davidow (MDV) funded startup, oneREV Inc, that was acquired by Agile Software in 2002. Most recently, Supreet served as Director of Engineering at Agile Software. Supreet received his BS in computer sciences with Highest Honors from the University of Texas at Austin, and an MS in computer sciences from Stanford University.