By Mark Little
September 23, 2002 12:00 AM EDT
Atomic transactions are a well-known technique for guaranteeing consistency in the presence of failures. The ACID properties of atomic transactions ensure that, even in complex business applications, consistency of state is preserved.
Transactions are best viewed as "short-lived" entities operating in a closely coupled environment, performing stable state changes to the system; they are less well suited for structuring "long-lived" application functions (e.g., running for hours, days, etc.) and running in a loosely coupled environment like the Web. Long-lived atomic transactions (as typically occur in business-to-business interactions) may reduce the concurrency in the system to an unacceptable level by holding on to resources (e.g., locks) for a long time; further, if such an atomic transaction rolls back, much valuable work already performed could be undone. As a result, there have been various extended transaction models in which strict ACID properties can be relaxed in a controlled manner. Until recently, translating these models into the world of Web services had not been attempted. However, the OASIS Business Transaction Protocol, specified by a collaboration of several companies, has tried to address this issue.
With the advent of Web services, the Web is being populated by service providers who wish to take advantage of this large B2B space. However, there are still important security and fault-tolerance considerations that must be addressed. One of these is the fact that the Web frequently suffers from failures that can affect both the performance and consistency of applications that run over it.
Atomic transactions are a well-known technique for guaranteeing consistency in the presence of failures. (Note: I will not use the term transaction in place of atomic transaction since in the B2B space this has different connotations.) The ACID properties of atomic transactions (Atomicity, Consistency, Isolation, Durability) ensure that even in complex business applications consistency of state is preserved, despite concurrent accesses and failures. This is an extremely useful fault-tolerance technique, especially when multiple, possibly remote, resources are involved.
The structuring mechanisms available within traditional atomic transaction systems are sequential and concurrent composition of transactions. These mechanisms are sufficient if an application function can be represented as a single atomic transaction. As Web services evolved as a means to integrate processes and applications at an inter-enterprise level, traditional transaction semantics and protocols have proven inappropriate. Web services-based transactions differ from traditional transactions in that they execute over long periods, they require commitments to the transaction to be "negotiated" at runtime, and isolation levels have to be relaxed.
As a result, there have been various extended transaction models, in which strict ACID properties can be relaxed in a controlled manner. Until recently, translating these models into the world of Web services had not been attempted. However, the OASIS Business Transaction Protocol (BTP), specified by a collaboration of several companies, has tried to address this issue. In this article we'll first consider why traditional atomic transactions are insufficient for long-running B2B activities, and then describe how BTP has attempted to solve these problems.
Why ACID Transactions Are Too Strong
ACID transactions by themselves are inadequate for structuring long-lived applications. To ensure ACID-ity between multiple participants, a multiphase (typically two-phase) consensus mechanism is required (see Figure 1). During the first (preparation) phase, an individual participant must make durable any state changes that occurred during the scope of the atomic transaction, such that these changes can either be rolled back (undone) or committed later, once consensus on the transaction outcome has been reached among all participants. In other words, the original state must not be lost at this point, as the atomic transaction could still roll back. If a failure occurs during the first phase, all participants are forced to undo their changes. Otherwise, in the second (commitment) phase, participants may "overwrite" the original state with the state made durable during the first phase.
In order to guarantee consensus, a two-phase commit is necessarily a blocking protocol. After returning the phase 1 response, each participant that returned a commit response must remain blocked until it has received the coordinator's phase 2 message telling it what to do. Until they receive this message, any resources used by the participant are unavailable for use by other atomic transactions, since to do so may result in non-ACID behavior. If the coordinator fails before delivery of the second phase message these resources remain blocked until it recovers. In addition, if a participant fails after phase 1, but before the coordinator can deliver its final commit decision, the atomic transaction cannot be completed until the participant recovers: all participants must see both phases of the commit protocol in order to guarantee ACID semantics. There is no implied time limit between a coordinator sending the first phase message of the commit protocol and it sending the second, commit phase message; there could be seconds or hours between them.
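The two phases described above can be sketched in a few lines of Python. This is a minimal in-memory illustration, not a real transaction manager: the `Participant` class, its `locked` flag, and the `two_phase_commit` function are hypothetical names invented for this sketch, and nothing is actually written to stable storage or recovered after a crash.

```python
from enum import Enum


class Vote(Enum):
    COMMIT = "commit"
    ROLLBACK = "rollback"


class Participant:
    """Hypothetical participant holding one value under transactional control."""

    def __init__(self, name, value, can_commit=True):
        self.name = name
        self.value = value            # original (committed) state
        self.pending = None           # new state, kept alongside the original
        self.locked = False           # True while blocked awaiting phase 2
        self.can_commit = can_commit  # assumption: lets the sketch simulate a refusal

    def update(self, new_value):
        self.pending = new_value

    def prepare(self):
        # Phase 1: keep both old and new state, then vote on the outcome.
        if not self.can_commit:
            return Vote.ROLLBACK
        self.locked = True            # resources are unavailable to other transactions
        return Vote.COMMIT

    def commit(self):
        # Phase 2: overwrite the original state with the prepared state.
        if self.pending is not None:
            self.value = self.pending
        self.pending = None
        self.locked = False

    def rollback(self):
        # Phase 2: discard the prepared state; the original state is untouched.
        self.pending = None
        self.locked = False


def two_phase_commit(participants):
    """Run both phases back to back; return True if everyone committed."""
    votes = [p.prepare() for p in participants]
    decision = all(v is Vote.COMMIT for v in votes)
    for p in participants:
        if decision:
            p.commit()
        else:
            p.rollback()
    return decision
```

In this sketch a participant that has voted to commit stays `locked` until the second loop reaches it. If the coordinator stopped between the two loops, nothing would ever clear that flag, which is the blocking behavior the text describes.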
Therefore, structuring certain activities from long-running atomic transactions can reduce the amount of concurrency within an application or (in the event of failures) require work to be performed again. For example, there are certain classes of application where it is known that resources acquired within an atomic transaction can be released "early," rather than having to wait until the atomic transaction terminates; in the event of the atomic transaction rolling back, however, certain compensation activities may be necessary to restore the system to a consistent state. Such compensation activities (which may perform forward or backward recovery) will typically be application specific, may not be necessary at all, or may be more efficiently dealt with by the application. For example, long-running activities can be structured as many independent, short-duration atomic transactions, to form a "logical" long-running transaction. This structure allows an activity to acquire and use resources for only the required duration of this long-running activity. In Figure 2 an application activity (shown by the dotted ellipse) has been split into many different, coordinated, short-duration atomic transactions. Assume that the application activity is concerned with booking a taxi (t1), reserving a table at a restaurant (t2), reserving a seat at the theater (t3), booking a room at a hotel (t4), and so on. If all of these operations were performed as a single atomic transaction, then resources acquired during t1 would not be released until the atomic transaction has terminated. If subsequent activities t2, t3, etc., do not require those resources, then they will be needlessly unavailable to other clients.
However, if failures and concurrent access occur during the lifetime of these individual transactional activities, then the behavior of the entire "logical long-running transaction" may not possess ACID properties. Therefore, some form of (application-specific) compensation may be required to attempt to return the state of the system to consistency. For example, let's assume that t4 aborts. Further assume that the application can continue to make forward progress, but in order to do so must now undo some state changes made prior to the start of t4 (by t1, t2, or t3). New activities are started; tc1 is a compensation activity that will attempt to undo state changes performed by, say, t2 and t3, which will continue the application once tc1 has completed. tc5' and tc6' are new activities that continue after compensation, e.g. since it was not possible to reserve the theater, restaurant, and hotel, it is decided to book tickets at the cinema. Obviously, other forms of composition are possible.
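As a rough illustration of this structure, the sketch below runs each step as its own short unit of work and, when one fails, applies compensation to the earlier steps that registered one. The `run_activity` function and the booking helpers are hypothetical, and the choice of which steps are compensated (t2 and t3, but not t1) simply follows the example in the text; a real application would decide this with its own business logic.

```python
def run_activity(steps):
    """Run (name, action, compensation) steps in order.

    Each action stands in for one short atomic transaction. If an action
    raises, the compensations registered by the completed steps run in
    reverse order; a compensation of None means there is nothing to undo.
    Returns (completed, compensated) lists of step names.
    """
    completed = []
    compensated = []
    for name, action, compensate in steps:
        try:
            action()
        except Exception:
            for done_name, done_compensate in reversed(completed):
                if done_compensate is not None:
                    done_compensate()
                    compensated.append(done_name)
            break
        completed.append((name, compensate))
    return [n for n, _ in completed], compensated


bookings = set()


def book(item):
    return lambda: bookings.add(item)


def unbook(item):
    return lambda: bookings.discard(item)


def no_rooms():
    raise RuntimeError("no hotel rooms available")


steps = [
    ("t1 taxi", book("taxi"), None),  # assumption: the taxi is kept regardless
    ("t2 restaurant", book("restaurant"), unbook("restaurant")),
    ("t3 theater", book("theater"), unbook("theater")),
    ("t4 hotel", no_rooms, None),
]
completed, compensated = run_activity(steps)
```

After this runs, only the taxi remains booked, and the application is free to start new activities such as booking the cinema. Note that between t2 completing and its compensation running, other clients could have observed the restaurant booking, which is why the overall activity is not ACID.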
Properties of a Web Service-Based Transaction
The fundamental question addressed here is what properties must a transaction model possess in order to support business-to-business interactions? To begin to answer that, we need to understand what we mean by a business transaction.
A business relationship is any distributed state that is maintained by two or more parties and is subject to some contractual constraints previously agreed to by those parties. A business transaction can therefore be considered as a consistent change in the state of a business relationship between parties. Each party in a business transaction holds its own application state corresponding to the business relationship with other parties in that transaction. During the course of a business transaction, this state may change.
In the Web services domain, information about business transactions is communicated in XML documents. However, how those documents are exchanged by the different parties involved (e.g., e-mail or HTTP) may be a function of the environment, type of business relationship, or other business or logistical factors. Therefore, mandating a specific XML carrier protocol may be too restrictive.
Since business relationships imply a level of value to the parties associated by those relationships, achieving some level of consensus among these parties is important. Not all participants within a particular business transaction have to see the same outcome; a specific transaction may possess multiple consensus groups.
In addition to understanding the outcomes, a participant within a business transaction may need to support provisional or tentative state changes during the course of the transaction. Such parties must also support the completion of a business transaction, either through confirmation (final effect) or cancellation (counter-effect). In general, what it means to confirm or cancel work done within a business transaction will be for the participant to determine.
For example, an application may choose to perform changes as provisional effects and make them visible to other business transactions. It may store the information necessary to undo these changes at the same time. On confirmation, it may simply discard these "undo" changes; on cancellation, it may apply them. An application can employ such a compensation-based approach or take a conventional "rollback" approach. It is with these properties in mind that we can discuss the Business Transaction Protocol.
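The compensation-based approach just described might look like the following sketch. The `ProvisionalStock` class and its method names are hypothetical and chosen only for illustration; the point is that the change is applied immediately, the undo information is stored beside it, and confirmation or cancellation decides what happens to that undo information.

```python
class ProvisionalStock:
    """Hypothetical stock holder that applies changes provisionally."""

    def __init__(self, stock):
        self.stock = dict(stock)
        self._undo = {}  # business transaction id -> [(item, quantity), ...]

    def reserve(self, tx_id, item, quantity):
        if self.stock.get(item, 0) < quantity:
            raise ValueError(f"not enough {item!r} in stock")
        # Provisional effect: applied now and visible to other transactions.
        self.stock[item] -= quantity
        # Undo information stored at the same time.
        self._undo.setdefault(tx_id, []).append((item, quantity))

    def confirm(self, tx_id):
        # Final effect: the change stands, so the undo information is discarded.
        self._undo.pop(tx_id, None)

    def cancel(self, tx_id):
        # Counter-effect: apply the stored undo information.
        for item, quantity in reversed(self._undo.pop(tx_id, [])):
            self.stock[item] += quantity


shop = ProvisionalStock({"book": 2})
shop.reserve("tx1", "book", 1)   # stock drops to 1 straight away
shop.cancel("tx1")               # stock returns to 2
shop.reserve("tx2", "book", 2)
shop.confirm("tx2")              # stock stays at 0
```

A conventional "rollback" implementation would instead keep the new state hidden until confirmation. Either choice sits behind the same confirm and cancel operations, which is why the decision can be left to the participant.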
The Business Transaction Protocol
B2B interactions may be complex, involving many parties, spanning many different organisations, and potentially lasting for hours or days. For example, the process of ordering and delivering parts for a computer may involve different suppliers, and may only be considered to have completed once the parts are delivered to their final destination. Most business-to-business collaborative applications require transactional support in order to guarantee a consistent outcome and correct execution. These applications often involve long-running computations, loosely coupled systems, and components that do not share data, location, or administration; it is then difficult to incorporate ACID transactions within such architectures. Furthermore, most collaborative business process management systems support complex, long-running processes in which undoing tasks that have already completed may be necessary in order to effect recovery or to choose another acceptable execution path.
For example, an online bookshop may well reserve books for an individual for a specific period of time, but if the individual does not purchase the books within that time period they will be "put back onto the shelf" for others to purchase; to do otherwise could result in the shop never selling a single book. Furthermore, because it is not possible for anyone to have an infinite supply of stock, some examples of online shops may appear to users to reserve items for them, but in fact if other users want to purchase them first they may be allowed to (i.e., the same book may be "reserved" for multiple users concurrently); a user may subsequently find that the item is no longer available, or may have to be ordered especially for them. If these examples were modelled using atomic transactions, then the reservation process would require the book to be locked for the duration of the atomic transaction - it would have to be available, and could not be acquired by (sold to) another user. When the atomic transaction commits, the book will be removed from stock and mailed to the user. However, if a failure occurs during the commitment protocol, the book may remain locked for an indeterminate amount of time (or until manual intervention occurs).
As a result, the use of traditional atomic transactions with strict ACID properties (e.g., systems that implement the JTS specification [SUN99]) is considered too restrictive for many types of applications.
The Organization for the Advancement of Structured Information Standards (OASIS) Business Transaction Protocol (BTP) is a transaction protocol that meets the requirement for Web-based, long-running collaborative business applications. BTP is designed to support applications that are disparate in time, location, and administration and thus require transactional support beyond classical ACID transactions. In short, BTP is a protocol for ensuring consistent outcomes from participating parties in a business transaction.
Note: It is important to realize that the term "transaction" in this sense does not mean atomic transaction, although ACID semantics can be obtained if required.
Consensus of Opinion
In general, a business transaction requires the capability for certain participants to be structured into a consensus group such that all of the members in a grouping have the same result. Different participants within the same business transaction may belong to different consensus groups. The business logic then controls how each group completes. In this way, a business transaction may cause a subset of the groups it naturally creates to perform the work it asks, while asking the other groups to undo the work.
Consider the situation shown in Figure 4, in which a user is booking a holiday, has provisionally reserved a flight ticket and taxi to the airport, and is now looking for travel insurance. The first consensus group holds Flights and Taxi, since neither of these can occur independently. The user may then decide to visit multiple insurance sites (called A and B in this example), and as he goes may reserve the quotes he likes. So, A may quote $50, which is just within budget, but the user may want to try B just in case he can find a cheaper price, without losing the initial quote. If the quote from B is less than that from A, the user may cancel A while confirming both the flights and the insurance from B. Each insurance site may therefore occur within its own consensus group. This is not possible when using ACID transactions.
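To make the grouping concrete, here is a small sketch of the holiday example. It is only a data-level illustration: the `complete` function, the group names, and the $45 quote from insurer B are assumptions made for this sketch (the text gives only A's $50 quote), and no BTP messages are exchanged.

```python
def complete(groups, decisions):
    """Apply one decision per consensus group.

    groups: dict mapping group name -> list of member names
    decisions: dict mapping group name -> "confirm" or "cancel"
    Returns a dict mapping every member to the outcome of its group.
    """
    outcomes = {}
    for group, members in groups.items():
        decision = decisions[group]
        if decision not in ("confirm", "cancel"):
            raise ValueError(f"unknown decision {decision!r} for group {group!r}")
        for member in members:
            outcomes[member] = decision  # every member of a group sees the same result
    return outcomes


groups = {
    "travel": ["flight", "taxi"],   # neither makes sense without the other
    "insurance_a": ["insurer A"],
    "insurance_b": ["insurer B"],
}
quotes = {"insurance_a": 50, "insurance_b": 45}  # B's price is invented for the example

# Business logic: keep the travel group, confirm only the cheapest insurance quote.
cheapest = min(quotes, key=quotes.get)
decisions = {"travel": "confirm"}
decisions.update({g: "confirm" if g == cheapest else "cancel" for g in quotes})

result = complete(groups, decisions)
```

Here `result` confirms the flight, the taxi, and insurer B while cancelling insurer A, so different participants in the same business transaction end with different outcomes.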
BTP uses a two-phase completion protocol to guarantee atomicity of decisions but does not imply specific implementations. To enforce this distinction, rather than call the second phases of the termination protocol "commit" and "rollback" as is the case in an ACID transaction environment, they are called "confirm" and "cancel" respectively, with the intention of decoupling the phases from any preconceptions of specific backward-compensation implementations.
It's important to stress that although BTP uses a two-phase protocol, it does not imply ACID transactions. How implementations of prepare, confirm, and cancel are provided is a back-end implementation decision. Issues to do with consistency and isolation of data are also back-end choices and not imposed or assumed by BTP. A BTP implementation is primarily concerned with two-phase coordination of abstract entities (participants).
In a traditional transaction system, the application or user has very few verbs with which to control the transaction. Typically, these are "begin," "commit," and "roll back," corresponding to starting a transaction, committing a transaction, and rolling back a transaction respectively. When an application asks for a transaction to commit, the coordinator will execute the entire two-phase commit protocol, as described earlier, before returning an outcome to the application (what BTP terms a closed-top commit protocol). The elapsed time between the execution of the first phase and the second phase is typically milliseconds to seconds, but is entirely under the control of the coordinator.
However, the actual two-phase protocol does not impose any restrictions on the time between executing the first and second phases. Obviously, the longer this period takes, the more chance there is for a failure to occur and the longer (critical) resources remain locked or isolated from other users. This is the reason why most ACID transaction systems attempt to keep this time frame to a minimum and why they do not work well in environments like the Web.
BTP, on the other hand, took the approach of allowing the time between these phases to be set by the application by expanding the verbs available to include explicit control over both phases of the termination protocol, i.e., "prepare," "confirm," and "cancel" - what BTP terms an open-top commit protocol. The application has complete control over when it can tell a transaction to prepare and, using whatever business logic is required, it can later determine which transaction(s) to confirm or cancel. This ability is a powerful tool for applications.
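The difference between the closed-top and open-top styles can be sketched as follows. BTP itself is defined in terms of messages rather than a programming interface, so the `OpenTopTransaction` and `Quote` classes below are hypothetical and exist only to show where control passes back to the application.

```python
class OpenTopTransaction:
    """Hypothetical coordinator; not an API defined by BTP."""

    def __init__(self, participants):
        self.participants = list(participants)
        self.state = "active"

    def prepare(self):
        # Phase 1 only: ask every participant whether it is able to confirm.
        if self.state != "active":
            raise RuntimeError(f"cannot prepare from state {self.state!r}")
        results = [p.prepare() for p in self.participants]
        if all(results):
            self.state = "prepared"
            return True
        self._cancel_all()
        return False

    def confirm(self):
        # Phase 2, chosen by the application whenever it is ready.
        if self.state != "prepared":
            raise RuntimeError(f"cannot confirm from state {self.state!r}")
        for p in self.participants:
            p.confirm()
        self.state = "confirmed"

    def cancel(self):
        if self.state not in ("active", "prepared"):
            raise RuntimeError(f"cannot cancel from state {self.state!r}")
        self._cancel_all()

    def _cancel_all(self):
        for p in self.participants:
            p.cancel()
        self.state = "cancelled"

    def commit(self):
        """Closed-top style: both phases run back to back."""
        if self.prepare():
            self.confirm()
            return True
        return False


class Quote:
    """Hypothetical participant: an insurance quote held provisionally."""

    def __init__(self, price, available=True):
        self.price = price
        self.available = available
        self.status = "offered"

    def prepare(self):
        self.status = "held" if self.available else "refused"
        return self.available

    def confirm(self):
        self.status = "bought"

    def cancel(self):
        self.status = "released"


quote_a, quote_b = Quote(50), Quote(45)   # prices are illustrative
tx_a = OpenTopTransaction([quote_a])
tx_b = OpenTopTransaction([quote_b])
tx_a.prepare()
tx_b.prepare()                            # both quotes are now held
# The application may take as long as it needs to decide here.
tx_b.confirm()
tx_a.cancel()
```

With only `commit()` available, the application would have to pick an insurer before preparing anything. Exposing `prepare()` separately lets it hold both quotes, compare them, and then decide which transaction to confirm.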
Atoms and Cohesions
To address the specific requirements of business transactions, BTP introduced two types of extended transactions, both using the open-top completion protocol:
In my next article, I'll take a closer look at the architecture of BTP and how XML is involved in it. I'll also look at the Web services stack and how BTP is used.