By Jim Gabriel
January 4, 2005
This article describes the increasing importance of metadata in today's service-oriented application landscape, and the consequent fragility inherent in architectures when faced with change.
When we reach the point at which metadata drives the development and maintenance of services, evolving business requirements force us to break open and evolve our metadata first, and then to address the services dependent on that metadata. Most development methodologies and environments are not sufficiently equipped to deal with such metadata-driven change. This article recommends a shift in the way we manage evolution in metadata-driven applications.
SOA and Metadata
A service-oriented architecture (SOA) is a metadata-driven architecture. Metadata is crucial to the development life cycle of Web services because the long-term maintainability of the SOA is at risk when the business logic expressed in services is not visible to the IT department at a higher level than in the code itself. However, there are many different kinds of metadata, not all of which are visible in application development environments. Figure 1 illustrates the metadata that we care about in an SOA.
The top half of this diamond represents WSDL and "policy" metadata, which is what most developers think of when we talk about metadata in an SOA. This metadata is described in XML, hence the general understanding that services are XML based. WSDL and policy metadata are low in semantic business information and high in technical information - they provide or facilitate the plumbing that allows the services to function.
The WSDL and policy part of this equation is of low strategic value to the business because it is largely generated. It falls out of any one of a number of application development tools that might be used to design and create services, or is handcrafted according to relatively simple requirements. When change is necessary in the business logic of an SOA, developers seldom need to concern themselves with this XML - it is the visible, accessible part of the iceberg, as it were.
The lower half of this diamond describes the payload, or the messages, that the services must process in order for the business process to succeed. Payloads require a very different and altogether more fragile kind of metadata: XML schemas. Strictly speaking, services with document-centric payloads can operate very well without an external description in XML schema - that is, all payloads have an implicit schema, and there is no requirement to express the schema explicitly in XSD. Without comprehensive metadata describing the payloads, however, implementing changes to a business process quickly resembles the process of looking for a needle in a haystack.
Expose the Underlying Models
To repeat an earlier statement: the long-term maintainability of the SOA is at risk when the business logic expressed in services is not visible to the IT department at a higher level than in the code itself. This is an important mantra when you consider that only message-based, document-centric SOAs are likely to be successful and low cost in the long term, as these allow us to rise above the point-to-point, RPC-style application integration of early service-based architectures. The latter tend over time toward time-consuming, error-prone, application-specific, and high-risk maintenance phases.
If you accept that the SOA should be message based, and your long-term goal is to achieve optimum efficiency in the development life cycle, a best practice is to externalize the schemas, expose the models, standardize, and federate. (This, by the way, is the advice currently being propagated by the majority of the world's SOA authorities, such as IBM, Sun Microsystems, Gartner, and so on.) Beyond a certain level of complexity, especially with multiple developers and teams collaborating on the development of services, the only safe way to constrain the business processes of an organization is to make the data model explicitly visible to all architects, developers, and project managers as a coherent set of XML schemas, and then to drive all service development on the basis of those schemas.
This article does not discuss how best to externalize schemas, as this will be the subject of an article in next month's issue. Suffice it to say that rather than attempting to extract or derive a schema from a service, our starting point ought to be the integration of the underlying data models, followed by the development of services. The important assumption for this article is that the messages carried by services are described and constrained by the integrated data model, and expressed in XML Schema. This metadata is of very high semantic and strategic value because it describes the business processes of an organization, as opposed to the plumbing.
Advantages
The advantages of externalized schemas for message-based SOA are as follows:
- Enforceable contracts for processing behavior
- Visible specifications for developers
- Public interfaces for new partners in the SOA
- Schema-based access to standard infrastructure such as parsers, transformation engines, and so on
- Insulation for services from changes to schemas
- Support for business analysts when planning changes
The disadvantages of any metadata-driven application environment are due entirely to the limitations of metadata in general and XML schemas in particular. What are these limitations? In essence, the XML schemas describing payloads are application specific, bespoke metadata that is subject to change, and requires human involvement when it evolves.
Nowadays we must expect schemas to change. Unfortunately, schema families and their associated assets (transformations and so on) present us with horrific redundancy and duplication when we try to evolve them by editing them. In any orchestrated set of Web services used and maintained by multiple development teams - for example, an order-to-invoice trading transaction involving multiple players - the externalized schemas and transformations (probably one of each per service) describe or reference the same data objects over and over again. Modifying any object presents the kind of maintenance nightmare that most of us try very hard to avoid in conventional programming environments.
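To make the duplication problem concrete, here is a minimal sketch in Python that scans a set of schemas and reports every top-level declaration that appears in more than one of them. The two schemas, their file names, and the function name are hypothetical and exist only for illustration; a real schema family would also need to account for imports, includes, and target namespaces, which this sketch ignores.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

XSD_NS = "{http://www.w3.org/2001/XMLSchema}"
TOP_LEVEL_KINDS = {"element", "complexType", "simpleType"}

# Two hypothetical per-service schemas that each redeclare the same Address type.
ORDER_XSD = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:complexType name="Address"><xs:sequence>
    <xs:element name="street" type="xs:string"/>
  </xs:sequence></xs:complexType>
  <xs:element name="Order" type="xs:string"/>
</xs:schema>"""

INVOICE_XSD = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:complexType name="Address"><xs:sequence>
    <xs:element name="street" type="xs:string"/>
  </xs:sequence></xs:complexType>
  <xs:element name="Invoice" type="xs:string"/>
</xs:schema>"""


def find_duplicate_declarations(schemas):
    """Return {(kind, name): [files]} for declarations found in more than one schema.

    Only named, top-level declarations are inspected. Elements and types are
    keyed separately because XML Schema keeps them in separate symbol spaces.
    """
    declared_in = defaultdict(list)
    for file_name, text in schemas.items():
        root = ET.fromstring(text)
        for child in root:
            if not child.tag.startswith(XSD_NS):
                continue
            kind = child.tag[len(XSD_NS):]
            name = child.get("name")
            if kind in TOP_LEVEL_KINDS and name:
                declared_in[(kind, name)].append(file_name)
    return {key: sorted(files) for key, files in declared_in.items() if len(files) > 1}


duplicates = find_duplicate_declarations(
    {"order.xsd": ORDER_XSD, "invoice.xsd": INVOICE_XSD}
)
print(duplicates)
```

Here the Address type is reported for both files, while Order and Invoice are not. Every entry in such a report is a place where an edit must be repeated by hand if the schemas are maintained as files.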
Versioning and Impact Problems
Managing XML infrastructure is different. When developers modify schema-driven applications by modifying the schemas, two problems arise: first, the new versions of the schemas are no longer in sync with the older versions; and second, the lack of a robust, scientific mechanism for identifying where every object has been defined and referenced forces developers into a manual maintenance exercise. This is generally not a problem when there is only one schema and one developer. For multiple schemas and multiple developers (or worse, multiple teams of developers), you have a very serious risk of conflicting modifications and inconsistency.
The sheer proliferation of references to single objects, coupled with the number of places where objects can be reused, increases the workload of maintenance projects in an exponential curve. Typically, the only people who can carry out maintenance work beyond a certain level of complexity are highly paid system experts who become IT bottlenecks due to the level of manual work involved. Such work can be very tedious.
Risk
The high level of risk inherent in such a situation makes the bottom half of the diamond in Figure 1 equivalent to the hidden 9/10 of the iceberg, which as you will recall from the Titanic is the stuff that sinks supposedly unsinkable ships. It is a sobering thought that if you cannot manage the evolution of the schemas governing the payloads in your SOA project, you may not even have a project in the long term.
Versioning and Extensibility Issues
The first question that most organizations set out to answer at this stage is, "How do we version schemas?" There are many different techniques applicable to the schema versioning problem, ranging from forcing instant incompatibility at one end of the spectrum (and thus forcing systems to upgrade), through to the opposite end of the spectrum where schema constraints are relaxed sufficiently to allow steadily broader ranges of content in service payloads.
These versioning techniques provide enough material for an article in their own right, so we will avoid the temptation to go into too much detail here. A quick summary is that while it is not impossible to version schemas and the systems that depend on them, nothing comes for free, and there are very few robust mechanisms and procedures that work well.
Extensible Schemas
Having experienced the pain of schema versioning, the next question that organizations inevitably ask themselves is this: "Is there a way of designing our schemas from the outset so that they are extensible?" Architects and system analysts are delighted to discover that XML Schema provides various ways of designing in extensibility, thus ensuring that schemas can be modified (read: "extended") without affecting existing systems.
Each extension, however, has the disadvantage of making the schema considerably more complex, with the logical conclusion in the long term that your metadata reaches a level of complexity that is unmanageable. Again, the subject of schema extensibility is large enough to warrant a separate article. For those who are interested, David Orchard has written an excellent article at www.xml.com/lpt/a/2003/12/03/versioning.html.
Impact in Deployed Environments
The consequence of any schema change forms the basis of the next question: "Okay, we've changed our schemas, how do we realign deployed services without any downtime?" Managing the change in metadata is one thing; managing the related changes in application code can be another problem altogether. What are the dependencies, and how can we automate the change management process? Again, in a contained environment with few developers, these issues do not present many problems (at least, nothing that cannot be overseen).
Pain Point for Life-cycle Management
However, the crucial question for life-cycle management is, "At what point does the evolution of your metadata become a challenge that adversely affects the life cycle of your Web services development?" Managing the evolution effectively is critical when the following apply:
- The organization has invested heavily in XML metadata-driven systems
- There are multiple developers and multiple teams working with the metadata
- There is ongoing change
- There is ongoing internal or external integration
Metadata evolution management is not scalable with most current technology. The current technology treats metadata as passive, as a reflection of what exists in the applications landscape. XML, and SOA in particular, are altering the role of metadata, however. Metadata is now active. We build new inter-enterprise and inter-application systems by first agreeing the contracts (the schemas or metadata), and then writing the code. The schemas determine how we program. It logically follows, therefore, that changes to such systems must first be implemented in the metadata and subsequently in the code.
New Mix of Technologies
To support active metadata, we need a new mix of technologies. An enterprise data dictionary platform is necessary to make all service-related metadata centrally visible to developers, wrapped in a development environment that conforms to a model-driven architecture. Such a system allows changes to metadata to be powerfully implemented in one central place (within the model contained in the dictionary) and deployed out to the system via automated processes and generators, as one would expect of a model-driven development environment. The visible metadata for the community of consumers appears as a strongly version-aware and variation-aware enterprise metadata registry.
The following functionality and technology are necessary to support this concept:
- Facilities for loading and managing existing metadata as an integrated data model
- A means to remove all duplication and redundancy in the integrated data model
- Design and development tools for ongoing metadata development or modification
- Impact analysis tools
- Change management
- Fine-grained version control
- Model-driven architecture
- Central repository
- Collaborative development across multiple teams
- Release management
Loading Existing Metadata
To create a model-driven environment for the development, management, and deployment of enterprise metadata, we first need to create an object model. This can be done either manually or by "vacuuming up" all existing schemas, XML representations of application models, and so on.
Remove Duplication and Redundancy
The essence of a true object model of metadata is that it is single source. This means that objects exist only where they have been defined, and all objects are unique. Every possible reference to, or reuse of, an object is managed as a reference link. During an import process, all redundancy and duplication is removed through the enforcement of unique object names in namespaces. Objects that cannot be resolved at import time can be managed as such, for future attention.
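The import behavior described above can be sketched as follows. This is only an illustration under stated assumptions: the `ObjectModel` class, its method names, and the `urn:example:trade` namespace are hypothetical, and "cannot be resolved" is interpreted here as "a second definition with the same namespace and name but different content", which is one plausible reading rather than the only one.

```python
from dataclasses import dataclass, field


@dataclass
class ObjectModel:
    """Hypothetical single-source store: one definition per (namespace, name)."""

    objects: dict = field(default_factory=dict)     # (namespace, name) -> definition
    unresolved: list = field(default_factory=list)  # conflicts parked for later review

    def import_object(self, namespace, name, definition, source):
        """Import one definition and return the key that callers should reference."""
        key = (namespace, name)
        existing = self.objects.get(key)
        if existing is None:
            # First sighting: this becomes the single definition.
            self.objects[key] = definition
        elif existing != definition:
            # Same name, different content: keep the first, flag the newcomer.
            self.unresolved.append((key, source))
        # An identical redefinition is simply dropped as a duplicate.
        return key


NS = "urn:example:trade"  # hypothetical namespace
model = ObjectModel()
model.import_object(NS, "Address", {"street": "string"}, "order.xsd")
model.import_object(NS, "Address", {"street": "string"}, "invoice.xsd")
model.import_object(NS, "Address", {"street": "string", "zip": "string"}, "shipping.xsd")

print(len(model.objects))
print(model.unresolved)
```

After the three imports the model holds a single Address object, and the conflicting definition from the third file is recorded for attention rather than silently overwriting the first.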
A distinction is made here between the objects that constitute schemas, and the schemas themselves. A schema is merely an assembled structure that pulls together objects from an object model and applies certain deployment properties.
Accordingly, importing a schema results in a record of how the schema was assembled and, separately, a model of the objects in that schema.
Design and Development Tools
Having created a single-source object model, it is imperative that development happens in the model and not in schemas. This part of the process represents the biggest break with the most commonly applied working methods - modifying a schema should never mean editing a schema file when you are working at this level of sophistication or complexity. Rather, editors must be available to allow object-level edits within the context of the object model. New schemas are "assembled" from collections of objects in the pool of objects that the model represents. Existing schemas can easily be redeployed without further modification because they are simply descriptions of the objects they contain and not the objects themselves.
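A minimal sketch of this assembly idea follows, assuming a deliberately simplified object pool in which each object is just a mapping of field names to type names. The pool contents, the assembly records, and the `assemble_schema` function are hypothetical; a real generator would also handle namespaces, cardinality, and the deployment properties mentioned earlier.

```python
# Hypothetical object pool: each object is defined exactly once.
OBJECT_POOL = {
    "Address": {"street": "xs:string", "city": "xs:string"},
    "Order": {"orderId": "xs:string", "shipTo": "Address"},
    "Invoice": {"invoiceId": "xs:string", "billTo": "Address"},
}

# A schema is only a record of which objects it pulls together.
ASSEMBLIES = {
    "order.xsd": ["Address", "Order"],
    "invoice.xsd": ["Address", "Invoice"],
}


def assemble_schema(object_names, pool):
    """Generate schema text for the listed objects from their single definitions."""
    lines = ['<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">']
    for name in object_names:
        lines.append(f'  <xs:complexType name="{name}">')
        lines.append("    <xs:sequence>")
        for field_name, field_type in pool[name].items():
            lines.append(f'      <xs:element name="{field_name}" type="{field_type}"/>')
        lines.append("    </xs:sequence>")
        lines.append("  </xs:complexType>")
    lines.append("</xs:schema>")
    return "\n".join(lines)


# The edit happens once, in the model, and never in a schema file.
OBJECT_POOL["Address"]["postcode"] = "xs:string"

# Every schema that assembles Address picks up the change when regenerated.
regenerated = {
    file_name: assemble_schema(names, OBJECT_POOL)
    for file_name, names in ASSEMBLIES.items()
}
print(regenerated["order.xsd"])
```

The assembly records themselves were not touched by the edit, which is why existing schemas can be redeployed without further modification.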
Impact Analysis
Whenever an object is modified in an editor, the system is aware of its place in the object model and all instances of its use in assembled schemas. It is therefore possible to generate impact analysis reports that chart exactly what is impacted by any given change.
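As a rough sketch of how such a report might be computed, assume the model records which objects each object references and which objects each schema assembles. All names and data below are hypothetical. The report follows references transitively, so a change to a low-level object surfaces every schema that depends on it, directly or indirectly.

```python
# Hypothetical model data: object -> objects it references.
REFERENCES = {
    "Address": [],
    "Customer": ["Address"],
    "Order": ["Customer"],
    "Invoice": ["Customer"],
    "Product": [],
}

# Hypothetical assembly records: schema -> objects it pulls in.
ASSEMBLIES = {
    "order.xsd": ["Order"],
    "invoice.xsd": ["Invoice"],
    "catalog.xsd": ["Product"],
}


def impacted_objects(changed, references):
    """Return the changed object plus every object that references it,
    directly or indirectly."""
    impacted = {changed}
    grew = True
    while grew:  # repeat until a full pass adds nothing new
        grew = False
        for obj, refs in references.items():
            if obj not in impacted and any(ref in impacted for ref in refs):
                impacted.add(obj)
                grew = True
    return impacted


def impact_report(changed, references, assemblies):
    """List the objects and assembled schemas affected by changing one object."""
    objects = impacted_objects(changed, references)
    schemas = sorted(
        schema
        for schema, members in assemblies.items()
        if any(member in objects for member in members)
    )
    return {"objects": sorted(objects), "schemas": schemas}


report = impact_report("Address", REFERENCES, ASSEMBLIES)
print(report)
```

Changing Address touches Customer, Order, and Invoice, and therefore the order and invoice schemas, while the catalog schema is untouched. The same report is the natural input to the automated change management described next.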
Change Management
If it is possible to analyze the impact of change, it is possible to automate all or part of the implementation of that change. For pure metadata implementations (i.e., schemas, transformations, etc.), this means creating a clean, identifiable, version-controlled build of the object model and generating out the deployable objects to the required consumers. Such an action should be automated and centrally driven by a build manager or administrator.
Fine-grained Version Control
Object-level edits to metadata make it possible to manage edits to schemas at the object level. Storing an incremented version of an object in such an environment is easy. The great fallacy of version control in schemas, however, is that anyone would want to access single objects according to their version level.
Object-level versioning is important in the context of schemas because schemas are based on a complete, coherent version of an object model, and we often want to introduce a single change to a single object in a whole family of schemas. Such a change must result in a re-release of the schemas at a new version level. The editing work is possible at the object level, however, and it must therefore be possible to cycle back to pre-edit versions of the model and branch off, or bug-fix the earlier version and merge again with the later version. This is only possible if object-level versioning is supported.
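The relationship between object-level versions and model-level builds can be sketched as follows. The `VersionedModel` class and its methods are hypothetical, and branching and merging are left out; the sketch only shows that if every object keeps its own history, a build can be recorded as a set of object versions and an earlier state of the whole model can be recovered exactly.

```python
class VersionedModel:
    """Hypothetical store that versions each object and tags coherent builds."""

    def __init__(self):
        self.history = {}  # object name -> list of definitions (index + 1 = version)
        self.builds = {}   # build label -> {object name: version number}

    def commit(self, name, definition):
        """Store a new version of one object and return its version number."""
        self.history.setdefault(name, []).append(definition)
        return len(self.history[name])

    def tag_build(self, label):
        """Record the current version of every object under a build label."""
        self.builds[label] = {name: len(versions) for name, versions in self.history.items()}

    def checkout(self, label):
        """Rebuild the complete model exactly as it was at the given build."""
        return {
            name: self.history[name][version - 1]
            for name, version in self.builds[label].items()
        }


model = VersionedModel()
model.commit("Address", {"street": "string"})
model.commit("Order", {"shipTo": "Address"})
model.tag_build("1.0")

# A single edit to a single object leads to a new build of the whole model.
model.commit("Address", {"street": "string", "postcode": "string"})
model.tag_build("1.1")

old_model = model.checkout("1.0")
new_model = model.checkout("1.1")
print(old_model["Address"], new_model["Address"])
```

Consumers only ever see builds 1.0 and 1.1, but cycling back to the pre-edit model is possible precisely because the individual Address versions were retained.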
Model-Driven Architecture
The model-driven architecture is essential for supporting the Construction-Assembly-Deployment workflow and paradigm. Only in this way can we move away from modifying an object by physically locating and manually modifying all references to, and instances of reuse of, that object. In a model-driven architecture, modifying the single object in the model is usually sufficient to equip the system with all the information it requires to regenerate all schemas, transformations, etc. where the object is referenced.
Central Repository
A repository mechanism using transaction-aware database technology is necessary to enable all of the above.
Collaborative Development
The following are essential to collaborative development environments:
- Access control
- User administration system with users, groups, roles, and permissions
- Single-user project workspaces on client machines
- Check-in and check-out in the central repository
- Conflict resolution at check-in and check-out time, managed by integrating checked-in tasks in a build mechanism
Release Management
A build mechanism enables us to uniquely identify and version-control an entire object model, including all the schemas, transformations, etc., that have been assembled from that model. Thus, a release that is based on a particular build can identify a version-controlled set of all the deployable objects for the model.
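One way to picture such a build identifier is as a manifest of content digests, sketched below. The `build_manifest` function, the artifact names, and the choice of SHA-256 are assumptions made for illustration and are not prescribed by anything above. The point is only that a release can name the exact set of deployable objects it contains.

```python
import hashlib
import json


def build_manifest(build_label, artifacts):
    """Describe a build as a digest per deployable artifact plus one overall id.

    `artifacts` maps a file name to its generated text. The overall id changes
    whenever any artifact is added, removed, or modified.
    """
    entries = {
        name: hashlib.sha256(content.encode("utf-8")).hexdigest()
        for name, content in sorted(artifacts.items())
    }
    build_id = hashlib.sha256(
        json.dumps(entries, sort_keys=True).encode("utf-8")
    ).hexdigest()
    return {"build": build_label, "build_id": build_id, "artifacts": entries}


# Hypothetical deployable objects generated from one build of the model.
release_1_0 = build_manifest("1.0", {
    "order.xsd": "<schema>order v1</schema>",
    "order-to-invoice.xsl": "<stylesheet>v1</stylesheet>",
})
rebuilt_1_0 = build_manifest("1.0", {
    "order-to-invoice.xsl": "<stylesheet>v1</stylesheet>",
    "order.xsd": "<schema>order v1</schema>",
})
release_1_1 = build_manifest("1.1", {
    "order.xsd": "<schema>order v2</schema>",
    "order-to-invoice.xsl": "<stylesheet>v1</stylesheet>",
})

print(release_1_0["build_id"] == rebuilt_1_0["build_id"])
print(release_1_0["build_id"] == release_1_1["build_id"])
```

Regenerating the same artifacts yields the same identifier regardless of the order in which they are listed, while a change to any one schema or transformation yields a new one.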
Summary
Doing all of the above correctly gives us the perfect solution to the life-cycle management of metadata-driven Web service development. In a nutshell, when metadata drives evolution, we manage the metadata evolution process by applying a model-driven methodology for exposing, standardizing, and federating the metadata. Such an approach makes it possible for all consumers to access metadata cleanly, simply, and safely through an enterprise metadata registry.
Copyright © 2005 SYS-CON Media, Inc. — All Rights Reserved.
Jim Gabriel has authored tens of thousands of pages of technical documentation, ranging from entry-level tutorial material to programmers' reference manuals. He is literate in XML, SGML, and XSL, among others.