Welcome!

SOA & WOA Authors: Pete Johnson, Elizabeth White, Carmen Gonzalez, Yeshim Deniz, AppDynamics Blog

Related Topics: SOA & WOA, XML

SOA & WOA: Article

Long-Running Transactions in SOA

Transaction management in long running service based activities

Most organizations that have tried have been successful in implementing a pliable Service Oriented Architecture (SOA) paradigm. Analysts have come out with strategies to translate existing applications into SOA-compliant systems using a staggered approach. The rewards reaped come in the form of low-cost maintenance and agility in their business, along with reusable and self-contained services. But there are still challenges in this form of service-based architecture and solutions need to be devised.

One of the biggest hurdles has been coordinating technology-agnostic services into a single long-running unit of work that produces predictable results. The transactions running across multiple services over multiple domains need to be synchronized to maintain business integrity. Currently organizations depend on proprietary solutions to coordinate transactions for data consistency. This article will walk you through the definition of long-running transaction in SOA and its challenges then talk about the various approaches to resolving the issue while retaining the characteristics of a service-based architecture.

ACID Applications
Applications utilize multiple services across different modules or layers to serve a particular business need. For example, security authentication, service information, EIS information, updating services need to coordinate in a business unit termed a transaction thatcomprehends data consistency and business integrity in an organization.

Transactions are a set of operations that must be executed entirely or not at all. The fault-tolerance mechanism of managing transactions is to maintain the so-called ACID properties: A - Atomicity (all or none), C - Consistency (the resource must start and end in a consistent state), I - Isolation (make the transactions appear isolated from all the other operations) and D - Durability (once notified, the transaction will persist). ACID provides concurrency in operations and retains data integrity.

ACID properties are easier to implement on transactions that run only a short time because during a transaction the resources are held in a locked state. Transactions that run for a long time can't afford to lock up resources. Till date, an ACID transaction assumes closely coupled systems that aren't an SOA-mandated environment. So the ACID properties in a long-running activity need to be applied so that locking doesn't occur, or if it does, then the duration of the locking is as short as possible.

Long-Running Transactions in Service-Crowded Systems
To understand the concept of a long-running transaction, we need to look first into the various lifetimes of a transaction. A transaction lifetime can be defined as the minimum amount of time a transaction is kept open. This time period can be anywhere from a few seconds to several hours. A transaction with a short lifetime can begin and end in a matter of seconds, while a long-running transaction can be alive for minutes, hours, even days depending upon the underlying business requirements and implementation. Transactions with a short lifetime are easy to handle since the resources they use can be locked for the time required by maintaining the ACID properties. But the same strategy can't be applied to long-running transactions. Locking up resources for a long time can seriously hamper the application's performance bringing in unnecessary deadlock situations and long wait-times. Any transaction left in an open state for an indefinite amount of time qualifies as a long-running transaction.

The following scenarios make long-running transactions possible:

  1. A transaction with lots of database queries
  2. B2B transactions
  3. Batch processes
  4. Pseudo-Asynchronous activities within a transaction
In a long running transaction with multiple queries made to the database, the failure of any single query would result in a transaction failure that requires rolling back completely to the previously saved safe state of the database. The scenario might be across different data sources across different administrative domains.

Batch processes run for long periods of time, usually for hours. Regularly backing up sensitive data is an example. In most cases, batch processes only involve reading data and hence not many transactional issues are encountered. But in certain cases these long batch processes can include modifying the data. A failure during that operation would require an equally long rollback process.

Pseudo-asynchronous activities are used in concurrent activities but the transactions are resumed at some kind of notification. Such operations can be trivial to handle as the control is passed on completely and there is a complex or no way back to reach the sender once the activity is completed. This results in a complex scenario where an independent or intelligently handled rollback needs to be initiated on the source.

In a SOA each functionality is separated as a service. So, a certain application may use many services to provide a defined functionality. The principles of SOA define services as separate platform- and system-independent entities - each capable of functioning on their own, thus providing reusability and interoperability across disparate platforms.

A long-running transaction creates a number of problems in a SOA architecture. As long as a transaction is limited to a closed environment, catching faults or exceptions and triggering the appropriate rollback mechanism can easily be defined in the underlying application architecture. For example, a transaction involving a database as a resource would already have mechanisms defined in it to handle errors and do rollbacks. Even in a distributed database environment these things can be taken care of. Imagine the same situation in an open SOA scenario where each transactional query is executed on an altogether different platform or system. How a rollback would be implemented in such a case is just one of the immediate questions that comes to mind.

Let's consider a scenario where the transaction involves the participation of three different services - each performing a particular operation. Only if all three operations are successful would the transaction be deemed a success. Any other outcome would result in the transaction being marked as a failure. Then, if and when the transaction fails, appropriate recovery measures have to be implemented. And to top it off, we can lock a resource only for the time when the service local to the resource is operating.

Troubles Within
Let's look into the problems encountered with long-running transactions in SOA. They can be referred to as failure cases:

  1. The participation of multiple services results in multiple endpoints being invoked during one cycle of the transaction. Any of these services can be down at the instance when the transaction is in process.
  2. SOA boasts of loosely coupled systems. Maintaining transactions is only possible in closely coupled systems.
  3. The services involved can be based on any platform. Because of the disparity among the underlying implementation of the services, a context can't be deployed across the services to manage the transactions.
  4. The current status in the flow of the transaction can't be known at a given instance.
  5. Ifasynchronous services are involved in the transaction they can't be reached back, unless the service information is explicitly passed on.
  6. Resources can't be kept in a locked state for long periods of time. To free a resource once the service is done with it, it must release it. Doing this can cause a problem later on if the service fails and a rollback is issued throughout all the services.
  7. Alternate methods need to be devised to perform the appropriate recovery operations. In most cases these methods are either platform-specific or too dependent on the underlying business process.
Viable Methodologies
Any methodology that tries to implement transaction management for a long-running transaction scenario in a SOA needs to make sure to:
  1. Uniquely identify a transaction across the various participating services
  2. Guarantee that the data is delivered and the notifications are sent
  3. Some compensation must be provided in case something goes wrong during the transaction
  4. Errors in asynchronous services have to be addressed
Keeping these points in mind, the following are some ways to manage long-running transactions in SOA:
  1. A compensation methodology
  2. Transaction coordinator
Methodology 1: Compensation
In an ideal situation any changes done during a long-running transaction must be reverted back to the original content in case there's a failure somewhere else along the flow of the transaction. This is precisely what happens in a closed environment and is known as a rollback. In a SOA architecture, a situation might arise where a rollback isn't an option. In that case, instead of a rollback, compensation is provided. For example, in WS choreography, the self-reliant services pass control messages back and forth to notify the participating services of a rollback operation.

Compensation may be defined as the most logical change applied to the resource to maintain data consistency and integrity. How it's constructed depends on the governing business rules and underlying technical implementation of the services. In certain cases, compensation can include a rollback. In the example above, if the transaction fails at the third service (the transaction is uniquely identified by an id throughout its lifetime), we need to perform a compensatory operation at the previous service to negate the effect caused by the transaction. So, if the second service sent out an e-mail announcing that it has implemented the changes, a compensatory operation would be to send another e-mail announcing the failure of the transaction and that the changes have been undone. A synchronous process showcasing the scenario is illustrated in Figure 1.

But what if the services participating in the process are asynchronous, as one would expect in a long-running transaction? One way would be to save states and service information.

Methodology 2: Transaction Coordinator
A more appropriate solution would be to orchestrate the process using a transaction manager or process coordinator. Instead of inter-service communication, the services would be answerable to the coordinator, which in turn would handle all the transaction and compensation scenarios. Once again the transaction will be uniquely identified throughout the transaction cycle by an id. This would help the coordinator perform compensatory operations on the required set of data. The coordinator can manage the service information as well. This would solve any issues with asynchronous services. Figure 2 illustrates the coordinator service. This kind of methodology is used mostly in service orchestration-type applications and is a more centralized approach unlike methodology 1.

Case study - Money Transfer Scenario
Consider a money transfer scenario (Figure 4) where a complete transaction process involves five different services. All five services are separated by virtue of both system and the language of implementation.

The first service, the initiation service, is exposed to the client to pick up the user input. It validates the necessary input parameters and processes the transaction by sequentially executing the credit service and the debit service. Then the system notifies the stakeholder and the internal logs for auditing.

With no transaction context involved in this processing, the services are executed independently with no knowledge of the member service status. There's no way for the executed services to rollback and for specific reasons:

  1. Service status isn't shared
  2. Non-availability of co-ordination federation in the processing
  3. Compensation services for revoking the services
Let's take one of the approaches and bridge the gaps. The solution is to have a coordinator orchestrating though the services and managing the transaction context. The coordinator invariably stores the status of the services and maintains the transaction supporting the ACID properties.

The coordinator receives the input and generates an id to uniquely identify the transaction. An acknowledgement is sent to the initiation service as RECEIVED. The initiation service notifies the client about the start of the process and provides the unique transaction id. The client can use this transaction id to monitor and track the transaction. The initiation service is now ready to take further client input. The coordinator maintains a log to record each operation it performs. The log is created against the transaction id.

After generating the id for the transaction, the coordinator calls the external service of the bank, which accepts the money. This credit service takes the necessary input and starts updating the records in the database. Depending upon the style of the compensation, state information is saved before the update process initiates. Once the update takes place successfully, an acknowledgement to the coordinator is sent. (Figure 3)

The coordinator then logs the changes and proceeds to call the debit service. The debit service makes the necessary changes to the local database to reflect the debit. The debit process follows the same pattern as the credit process. On successful operation, a DEBITED acknowledgement is sent to the coordinator. The coordinator notifies each service involved of successful individual transactions at each step by means enacts the 2PC execution. When there's a failure, the coordinator runs the compensation service for each activity.

Conclusion
The long-running transaction is designed specifically for business interactions that take a long time. The intention is to tie the logical single business-to-business unit of work across heterogeneous domains. Each methodology depends on the architecture of the system and the existing assets in the organization. Technical analysts need to differentiate such special transaction in the SOA study and deal with them through special defined methodologies.

References
1.  William Cox. "Transactional Web Services."
http://dev2dev.bea.com/pub/a/2004/01/cox.html

2.  Pat Helland. "Why I hate the phrase Long running Transactions..."
http://blogs.msdn.com/pathelland/archive/2004/08/12/213552.aspx

3.  Wikipedia Atomic Transactions:
http://en.wikipedia.org/wiki/Atomic_transaction
Database Transactions:
http://en.wikipedia.org/wiki/Database_transaction
Two-phase commit protocol:
http://en.wikipedia.org/wiki/2-phase_commit
ACID properties:
http://en.wikipedia.org/wiki/ACID

More Stories By Anshuk Pal Chaudhari

The authors are interning and/or working as part of the Web Services COE (Center of Excellence) for Infosys Technologies, a global IT consulting firm, and have substantial experience in publishing papers, presenting papers at conferences, and defining standards for SOA and Web services. The Web Services COE specializes in SOA, Web services, and other related technologies.

More Stories By Bijoy Majumdar

Bijoy Majumdar is a member of the Web Services COE (Center of Excellence) for Infosys Technologies, a global IT consulting firm, and has substantial experience in publishing papers, presenting papers at conferences, and defining standards for SOA and Web services. Prior to Infosys, Bijoy Majumdar worked as an IT Analyst, and had been a member of the GE Center of Excellence (e-center) under the E-Business Practice of Tata Consultancy Services.

More Stories By Sunny Saxena

Sunny Saxena currently works with the Web Services Centre of Excellence in SETLabs, the technology research division at Infosys Technologies, India. His interests range from Web service security platforms to aspect-oriented development models.

Comments (0)

Share your thoughts on this story.

Add your comment
You must be signed in to add a comment. Sign-in | Register

In accordance with our Comment Policy, we encourage comments that are on topic, relevant and to-the-point. We will remove comments that include profanity, personal attacks, racial slurs, threats of violence, or other inappropriate material that violates our Terms and Conditions, and will block users who make repeated violations. We ask all readers to expect diversity of opinion and to treat one another with dignity and respect.


@ThingsExpo Stories
WebRTC defines no default signaling protocol, causing fragmentation between WebRTC silos. SIP and XMPP provide possibilities, but come with considerable complexity and are not designed for use in a web environment. In his session at @ThingsExpo, Matthew Hodgson, technical co-founder of the Matrix.org, discussed how Matrix is a new non-profit Open Source Project that defines both a new HTTP-based standard for VoIP & IM signaling and provides reference implementations.
DevOps Summit 2015 New York, co-located with the 16th International Cloud Expo - to be held June 9-11, 2015, at the Javits Center in New York City, NY - announces that it is now accepting Keynote Proposals. The widespread success of cloud computing is driving the DevOps revolution in enterprise IT. Now as never before, development teams must communicate and collaborate in a dynamic, 24/7/365 environment. There is no time to wait for long development cycles that produce software that is obsolete at launch. DevOps may be disruptive, but it is essential.
Explosive growth in connected devices. Enormous amounts of data for collection and analysis. Critical use of data for split-second decision making and actionable information. All three are factors in making the Internet of Things a reality. Yet, any one factor would have an IT organization pondering its infrastructure strategy. How should your organization enhance its IT framework to enable an Internet of Things implementation? In his session at Internet of @ThingsExpo, James Kirkland, Chief Architect for the Internet of Things and Intelligent Systems at Red Hat, described how to revolutioniz...
Scott Jenson leads a project called The Physical Web within the Chrome team at Google. Project members are working to take the scalability and openness of the web and use it to talk to the exponentially exploding range of smart devices. Nearly every company today working on the IoT comes up with the same basic solution: use my server and you'll be fine. But if we really believe there will be trillions of these devices, that just can't scale. We need a system that is open a scalable and by using the URL as a basic building block, we open this up and get the same resilience that the web enjoys.
The Internet of Things is tied together with a thin strand that is known as time. Coincidentally, at the core of nearly all data analytics is a timestamp. When working with time series data there are a few core principles that everyone should consider, especially across datasets where time is the common boundary. In his session at Internet of @ThingsExpo, Jim Scott, Director of Enterprise Strategy & Architecture at MapR Technologies, discussed single-value, geo-spatial, and log time series data. By focusing on enterprise applications and the data center, he will use OpenTSDB as an example t...
Connected devices and the Internet of Things are getting significant momentum in 2014. In his session at Internet of @ThingsExpo, Jim Hunter, Chief Scientist & Technology Evangelist at Greenwave Systems, examined three key elements that together will drive mass adoption of the IoT before the end of 2015. The first element is the recent advent of robust open source protocols (like AllJoyn and WebRTC) that facilitate M2M communication. The second is broad availability of flexible, cost-effective storage designed to handle the massive surge in back-end data in a world where timely analytics is e...
The 3rd International Internet of @ThingsExpo, co-located with the 16th International Cloud Expo - to be held June 9-11, 2015, at the Javits Center in New York City, NY - announces that its Call for Papers is now open. The Internet of Things (IoT) is the biggest idea since the creation of the Worldwide Web more than 20 years ago.
The definition of IoT is not new, in fact it’s been around for over a decade. What has changed is the public's awareness that the technology we use on a daily basis has caught up on the vision of an always on, always connected world. If you look into the details of what comprises the IoT, you’ll see that it includes everything from cloud computing, Big Data analytics, “Things,” Web communication, applications, network, storage, etc. It is essentially including everything connected online from hardware to software, or as we like to say, it’s an Internet of many different things. The difference ...
How do APIs and IoT relate? The answer is not as simple as merely adding an API on top of a dumb device, but rather about understanding the architectural patterns for implementing an IoT fabric. There are typically two or three trends: Exposing the device to a management framework Exposing that management framework to a business centric logic Exposing that business layer and data to end users. This last trend is the IoT stack, which involves a new shift in the separation of what stuff happens, where data lives and where the interface lies. For instance, it's a mix of architectural styles ...
The security devil is always in the details of the attack: the ones you've endured, the ones you prepare yourself to fend off, and the ones that, you fear, will catch you completely unaware and defenseless. The Internet of Things (IoT) is nothing if not an endless proliferation of details. It's the vision of a world in which continuous Internet connectivity and addressability is embedded into a growing range of human artifacts, into the natural world, and even into our smartphones, appliances, and physical persons. In the IoT vision, every new "thing" - sensor, actuator, data source, data con...
The Internet of Things promises to transform businesses (and lives), but navigating the business and technical path to success can be difficult to understand. In his session at @ThingsExpo, Sean Lorenz, Technical Product Manager for Xively at LogMeIn, demonstrated how to approach creating broadly successful connected customer solutions using real world business transformation studies including New England BioLabs and more.
SYS-CON Events announced today that Gridstore™, the leader in hyper-converged infrastructure purpose-built to optimize Microsoft workloads, will exhibit at SYS-CON's 16th International Cloud Expo®, which will take place on June 9-11, 2015, at the Javits Center in New York City, NY. Gridstore™ is the leader in hyper-converged infrastructure purpose-built for Microsoft workloads and designed to accelerate applications in virtualized environments. Gridstore’s hyper-converged infrastructure is the industry’s first all flash version of HyperConverged Appliances that include both compute and storag...
P2P RTC will impact the landscape of communications, shifting from traditional telephony style communications models to OTT (Over-The-Top) cloud assisted & PaaS (Platform as a Service) communication services. The P2P shift will impact many areas of our lives, from mobile communication, human interactive web services, RTC and telephony infrastructure, user federation, security and privacy implications, business costs, and scalability. In his session at @ThingsExpo, Robin Raymond, Chief Architect at Hookflash, will walk through the shifting landscape of traditional telephone and voice services ...
"Matrix is an ambitious open standard and implementation that's set up to break down the fragmentation problems that exist in IP messaging and VoIP communication," explained John Woolf, Technical Evangelist at Matrix, in this SYS-CON.tv interview at @ThingsExpo, held Nov 4–6, 2014, at the Santa Clara Convention Center in Santa Clara, CA.
An entirely new security model is needed for the Internet of Things, or is it? Can we save some old and tested controls for this new and different environment? In his session at @ThingsExpo, New York's at the Javits Center, Davi Ottenheimer, EMC Senior Director of Trust, reviewed hands-on lessons with IoT devices and reveal a new risk balance you might not expect. Davi Ottenheimer, EMC Senior Director of Trust, has more than nineteen years' experience managing global security operations and assessments, including a decade of leading incident response and digital forensics. He is co-author of t...
We are reaching the end of the beginning with WebRTC, and real systems using this technology have begun to appear. One challenge that faces every WebRTC deployment (in some form or another) is identity management. For example, if you have an existing service – possibly built on a variety of different PaaS/SaaS offerings – and you want to add real-time communications you are faced with a challenge relating to user management, authentication, authorization, and validation. Service providers will want to use their existing identities, but these will have credentials already that are (hopefully) i...
The 3rd International @ThingsExpo, co-located with the 16th International Cloud Expo - to be held June 9-11, 2015, at the Javits Center in New York City, NY - announces that it is now accepting Keynote Proposals. The Internet of Things (IoT) is the most profound change in personal and enterprise IT since the creation of the Worldwide Web more than 20 years ago. All major researchers estimate there will be tens of billions devices - computers, smartphones, tablets, and sensors - connected to the Internet by 2020. This number will continue to grow at a rapid pace for the next several decades.
The Internet of Things will greatly expand the opportunities for data collection and new business models driven off of that data. In her session at @ThingsExpo, Esmeralda Swartz, CMO of MetraTech, discussed how for this to be effective you not only need to have infrastructure and operational models capable of utilizing this new phenomenon, but increasingly service providers will need to convince a skeptical public to participate. Get ready to show them the money!
The 3rd International Internet of @ThingsExpo, co-located with the 16th International Cloud Expo - to be held June 9-11, 2015, at the Javits Center in New York City, NY - announces that its Call for Papers is now open. The Internet of Things (IoT) is the biggest idea since the creation of the Worldwide Web more than 20 years ago.
"There is a natural synchronization between the business models, the IoT is there to support ,” explained Brendan O'Brien, Co-founder and Chief Architect of Aria Systems, in this SYS-CON.tv interview at the 15th International Cloud Expo®, held Nov 4–6, 2014, at the Santa Clara Convention Center in Santa Clara, CA.