By Anshuk Pal Chaudhari, Bijoy Majumdar, Sunny Saxena
February 26, 2007 04:30 PM EST
Most organizations that have tried have been successful in implementing a pliable Service Oriented Architecture (SOA) paradigm. Analysts have come out with strategies to translate existing applications into SOA-compliant systems using a staggered approach. The rewards reaped come in the form of low-cost maintenance and agility in their business, along with reusable and self-contained services. But there are still challenges in this form of service-based architecture and solutions need to be devised.
One of the biggest hurdles has been coordinating technology-agnostic services into a single long-running unit of work that produces predictable results. The transactions running across multiple services over multiple domains need to be synchronized to maintain business integrity. Currently organizations depend on proprietary solutions to coordinate transactions for data consistency. This article walks you through the definition of a long-running transaction in SOA and its challenges, then discusses the various approaches to resolving the issue while retaining the characteristics of a service-based architecture.
Applications utilize multiple services across different modules or layers to serve a particular business need. For example, security authentication, service information, EIS information, and update services need to coordinate in a business unit termed a transaction, which ensures data consistency and business integrity in an organization.
Transactions are a set of operations that must be executed entirely or not at all. The fault-tolerance mechanism of managing transactions is to maintain the so-called ACID properties: A - Atomicity (all or none), C - Consistency (the resource must start and end in a consistent state), I - Isolation (make the transactions appear isolated from all the other operations) and D - Durability (once notified, the transaction will persist). ACID provides concurrency in operations and retains data integrity.
ACID properties are easier to implement on transactions that run only a short time because the resources are held in a locked state for the duration of the transaction. Transactions that run for a long time can't afford to lock up resources. To date, ACID transactions have assumed closely coupled systems, which is not the environment SOA mandates. So the ACID properties in a long-running activity need to be applied so that locking doesn't occur, or if it does, the duration of the locking is as short as possible.
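As a point of reference for the short-lived case, here is a minimal sketch of an atomic transaction against a single local database, using Python's standard `sqlite3` module. The `accounts` table, the account names, and the `make_db` and `transfer` helpers are assumptions made for illustration; they do not come from the article.

```python
import sqlite3


def make_db():
    """Create an in-memory database with two hypothetical accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)]
    )
    conn.commit()
    return conn


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst; either both updates persist or neither."""
    try:
        with conn:  # commits on normal exit, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            (remaining,) = conn.execute(
                "SELECT balance FROM accounts WHERE name = ?", (src,)
            ).fetchone()
            if remaining < 0:
                raise ValueError("insufficient funds")
        return True
    except ValueError:
        return False


def balances(conn):
    """Return the current balances as a dict keyed by account name."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))
```

The database holds its locks only for the few statements inside the `with` block, which is what makes this style workable for short transactions and impractical when the unit of work spans hours and several independent services.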
Long-Running Transactions in Service-Crowded Systems
To understand the concept of a long-running transaction, we first need to look at the various lifetimes of a transaction. A transaction lifetime can be defined as the amount of time a transaction is kept open. This time period can be anywhere from a few seconds to several hours. A transaction with a short lifetime can begin and end in a matter of seconds, while a long-running transaction can be alive for minutes, hours, or even days, depending on the underlying business requirements and implementation. Transactions with a short lifetime are easy to handle since the resources they use can be locked for the time required to maintain the ACID properties. But the same strategy can't be applied to long-running transactions. Locking up resources for a long time can seriously hamper the application's performance, introducing unnecessary deadlocks and long wait times. Any transaction left in an open state for an indefinite amount of time qualifies as a long-running transaction.
The following scenarios make long-running transactions possible:
- A transaction with lots of database queries
- B2B transactions
- Batch processes
- Pseudo-Asynchronous activities within a transaction
Batch processes run for long periods of time, usually for hours. Regularly backing up sensitive data is an example. In most cases, batch processes only involve reading data and hence not many transactional issues are encountered. But in certain cases these long batch processes can include modifying the data. A failure during that operation would require an equally long rollback process.
Pseudo-asynchronous activities run concurrently, and the transaction resumes when some kind of notification arrives. Such operations can be non-trivial to handle because control is passed on completely, and the way back to the sender once the activity completes is either complex or nonexistent. This results in a complex scenario where an independent or intelligently handled rollback needs to be initiated on the source.
In an SOA, each piece of functionality is separated out as a service. So, a certain application may use many services to provide a defined functionality. The principles of SOA define services as separate platform- and system-independent entities, each capable of functioning on its own, thus providing reusability and interoperability across disparate platforms.
A long-running transaction creates a number of problems in an SOA. As long as a transaction is limited to a closed environment, catching faults or exceptions and triggering the appropriate rollback mechanism can easily be defined in the underlying application architecture. For example, a transaction involving a database as a resource would already have mechanisms defined in it to handle errors and do rollbacks. Even in a distributed database environment these things can be taken care of. Imagine the same situation in an open SOA scenario where each transactional query is executed on an altogether different platform or system. How a rollback would be implemented in such a case is just one of the immediate questions that comes to mind.
Let's consider a scenario where the transaction involves the participation of three different services - each performing a particular operation. Only if all three operations are successful would the transaction be deemed a success. Any other outcome would result in the transaction being marked as a failure. Then, if and when the transaction fails, appropriate recovery measures have to be implemented. And to top it off, we can lock a resource only for the time when the service local to the resource is operating.
Let's look into the problems encountered with long-running transactions in SOA. They can be referred to as failure cases:
- The participation of multiple services results in multiple endpoints being invoked during one cycle of the transaction. Any of these services can be down at the instant when the transaction is in process.
- SOA boasts of loosely coupled systems. Maintaining transactions is only possible in closely coupled systems.
- The services involved can be based on any platform. Because of the disparity among the underlying implementation of the services, a context can't be deployed across the services to manage the transactions.
- The current status in the flow of the transaction can't be known at a given instant.
- If asynchronous services are involved in the transaction, they can't be reached back unless the service information is explicitly passed on.
- Resources can't be kept in a locked state for long periods of time, so a service must release a resource as soon as it is done with it. This can cause a problem later on if another service fails and a rollback is issued throughout all the services.
- Alternate methods need to be devised to perform the appropriate recovery operations. In most cases these methods are either platform-specific or too dependent on the underlying business process.
Any methodology that tries to implement transaction management for a long-running transaction scenario in a SOA needs to make sure to:
- Uniquely identify a transaction across the various participating services
- Guarantee that the data is delivered and the notifications are sent
- Provide compensation in case something goes wrong during the transaction
- Address errors in asynchronous services
Two methodologies can address these requirements:
- A compensation methodology
- A transaction coordinator
Methodology 1: Compensation
In an ideal situation, any changes made during a long-running transaction must be reverted to their original state if there's a failure somewhere else along the flow of the transaction. This is precisely what happens in a closed environment and is known as a rollback. In an SOA, a situation might arise where a rollback isn't an option. In that case, instead of a rollback, compensation is provided. For example, in WS choreography, the self-reliant services pass control messages back and forth to notify the participating services of a rollback operation.
Compensation may be defined as the most logical change applied to the resource to maintain data consistency and integrity. How it's constructed depends on the governing business rules and underlying technical implementation of the services. In certain cases, compensation can include a rollback. In the example above, if the transaction fails at the third service (the transaction is uniquely identified by an id throughout its lifetime), we need to perform a compensatory operation at the previous service to negate the effect caused by the transaction. So, if the second service sent out an e-mail announcing that it has implemented the changes, a compensatory operation would be to send another e-mail announcing the failure of the transaction and that the changes have been undone. A synchronous process showcasing the scenario is illustrated in Figure 1.
But what if the services participating in the process are asynchronous, as one would expect in a long-running transaction? One way would be to save states and service information.
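The compensation idea can be sketched in a few lines of Python. This is an illustrative sketch only: the `CompensatingTransaction` class, the `StepFailed` exception, and the in-memory `record` and `outbox` stand-ins are hypothetical names, and real services would be remote calls. Each step pairs an action with a compensating operation, and when a later step fails the completed steps are compensated in reverse order, using state saved before the change.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


class StepFailed(Exception):
    """Raised by a (hypothetical) service call that could not complete."""


@dataclass
class CompensatingTransaction:
    tx_id: str  # identifies the transaction across all participating services
    _undo: List[Tuple[str, Callable[[], None]]] = field(default_factory=list)

    def run(self, steps):
        """Run (name, action, compensation) steps in order; True if all succeed."""
        for name, action, compensation in steps:
            try:
                action()
            except StepFailed:
                self._compensate()
                return False
            self._undo.append((name, compensation))
        return True

    def _compensate(self):
        # Compensate the most recently completed step first.
        while self._undo:
            _name, compensation = self._undo.pop()
            compensation()


def demo():
    """Three services: update a record, send an e-mail, then fail."""
    record = {"value": "old"}
    saved = {}   # state saved before the update, used by the compensation
    outbox = []  # stands in for an e-mail service

    def update_record():
        saved["value"] = record["value"]
        record["value"] = "new"

    def restore_record():
        record["value"] = saved["value"]

    def failing_service():
        raise StepFailed("third service unavailable")

    tx = CompensatingTransaction("tx-1")
    ok = tx.run([
        ("update", update_record, restore_record),
        ("notify", lambda: outbox.append("changes applied"),
         lambda: outbox.append("transaction failed; changes undone")),
        ("third", failing_service, lambda: None),
    ])
    return ok, record, outbox
```

Note that the e-mail is never "unsent"; its compensation is a second message, which is the kind of business-level correction the text describes.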
Methodology 2: Transaction Coordinator
A more appropriate solution would be to orchestrate the process using a transaction manager or process coordinator. Instead of inter-service communication, the services would be answerable to the coordinator, which in turn would handle all the transaction and compensation scenarios. Once again the transaction will be uniquely identified throughout the transaction cycle by an id. This would help the coordinator perform compensatory operations on the required set of data. The coordinator can manage the service information as well. This would solve any issues with asynchronous services. Figure 2 illustrates the coordinator service. This kind of methodology is used mostly in service orchestration-type applications and is a more centralized approach unlike methodology 1.
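One way a coordinator might keep service information for asynchronous participants is sketched below. It is a simplified illustration under stated assumptions: the `AsyncCoordinator` class, its method names, the status strings, and the example endpoints are all hypothetical. The coordinator records each service's endpoint against the transaction id, correlates incoming notifications by that id, and on a failure reports which completed services need compensation.

```python
import uuid


class AsyncCoordinator:
    def __init__(self):
        self._tx = {}  # tx_id -> service information and progress

    def begin(self, services):
        """Register a transaction; `services` maps step name -> endpoint."""
        tx_id = str(uuid.uuid4())
        self._tx[tx_id] = {
            "services": dict(services),
            "done": [],
            "status": "ACTIVE",
        }
        return tx_id

    def on_notification(self, tx_id, step, succeeded):
        """Handle an asynchronous completion notice.

        Returns the endpoints to compensate, most recent first
        (an empty list when nothing needs compensating).
        """
        tx = self._tx[tx_id]
        if tx["status"] != "ACTIVE" or step in tx["done"]:
            return []  # late or duplicate notification; ignore it
        if succeeded:
            tx["done"].append(step)
            if len(tx["done"]) == len(tx["services"]):
                tx["status"] = "COMPLETED"
            return []
        tx["status"] = "COMPENSATING"
        return [tx["services"][s] for s in reversed(tx["done"])]

    def status(self, tx_id):
        return self._tx[tx_id]["status"]


# Hypothetical endpoints, used only for illustration.
EXAMPLE_SERVICES = {
    "credit": "https://bank-a.example/credit",
    "debit": "https://bank-b.example/debit",
}
```

Because the endpoints live with the coordinator, an asynchronous service never has to find its way back to the caller; it only needs the transaction id to report its outcome.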
Case study - Money Transfer Scenario
Consider a money transfer scenario (Figure 4) where a complete transaction process involves five different services. All five services are separated by virtue of both system and the language of implementation.
The first service, the initiation service, is exposed to the client to pick up the user input. It validates the necessary input parameters and processes the transaction by sequentially executing the credit service and the debit service. Then the system notifies the stakeholder and the internal logs for auditing.
With no transaction context involved in this processing, the services are executed independently with no knowledge of the status of the other member services. There's no way for the executed services to roll back, for specific reasons:
- Service status isn't shared
- No coordination federation is available in the processing
- No compensation services exist for revoking the services' effects
The coordinator receives the input and generates an id to uniquely identify the transaction. An acknowledgement is sent to the initiation service as RECEIVED. The initiation service notifies the client about the start of the process and provides the unique transaction id. The client can use this transaction id to monitor and track the transaction. The initiation service is now ready to take further client input. The coordinator maintains a log to record each operation it performs. The log is created against the transaction id.
After generating the id for the transaction, the coordinator calls the external service of the bank, which accepts the money. This credit service takes the necessary input and starts updating the records in the database. Depending upon the style of the compensation, state information is saved before the update process initiates. Once the update takes place successfully, an acknowledgement to the coordinator is sent. (Figure 3)
The coordinator then logs the changes and proceeds to call the debit service. The debit service makes the necessary changes to the local database to reflect the debit. The debit process follows the same pattern as the credit process. On successful operation, a DEBITED acknowledgement is sent to the coordinator. The coordinator notifies each service involved of the successful individual transactions at each step, in effect enacting a 2PC-style execution. When there's a failure, the coordinator runs the compensation service for each activity.
The long-running transaction is designed specifically for business interactions that take a long time. The intention is to tie together a single logical business-to-business unit of work across heterogeneous domains. Each methodology depends on the architecture of the system and the existing assets in the organization. Technical analysts need to identify such special transactions in the SOA study and deal with them through specially defined methodologies.
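The credit and debit steps of the case study can be sketched as follows. This is a simplified illustration rather than the authors' implementation: `AccountService`, `TransferCoordinator`, and the `CREDITED`, `COMPENSATED`, `COMPLETED` and `FAILED` log entries are assumed names (only `RECEIVED` and `DEBITED` appear in the text), the services are in-process objects rather than remote calls, and the input validation, notification and audit services are omitted.

```python
import uuid


class AccountService:
    """Stands in for a remote credit or debit service with its own local store."""

    def __init__(self, balances):
        self.balances = dict(balances)
        self._applied = {}  # tx_id -> (account, delta), saved for compensation

    def apply(self, tx_id, account, delta):
        new_balance = self.balances[account] + delta
        if new_balance < 0:
            raise ValueError("insufficient funds")
        # Save state before updating. This style records the change itself;
        # saving a snapshot of the old balance would be another option.
        self._applied[tx_id] = (account, delta)
        self.balances[account] = new_balance

    def compensate(self, tx_id):
        account, delta = self._applied.pop(tx_id)
        self.balances[account] -= delta


class TransferCoordinator:
    def __init__(self, credit_service, debit_service):
        self.credit_service = credit_service
        self.debit_service = debit_service
        self.log = {}  # tx_id -> ordered list of logged events

    def transfer(self, payer, payee, amount):
        tx_id = str(uuid.uuid4())  # uniquely identifies the transaction
        events = self.log.setdefault(tx_id, [])
        events.append("RECEIVED")
        steps = [
            ("CREDITED", self.credit_service, payee, amount),
            ("DEBITED", self.debit_service, payer, -amount),
        ]
        completed = []
        for ack, service, account, delta in steps:
            try:
                service.apply(tx_id, account, delta)
            except (ValueError, KeyError):
                # Compensate every completed activity, most recent first.
                for done in reversed(completed):
                    done.compensate(tx_id)
                    events.append("COMPENSATED")
                events.append("FAILED")
                return tx_id, False
            completed.append(service)
            events.append(ack)
        events.append("COMPLETED")
        return tx_id, True
```

A client holding the returned transaction id could look up `coordinator.log[tx_id]` to track progress, which mirrors the monitoring role the id plays in the walkthrough above.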