Welcome!

SOA & WOA Authors: Yeshim Deniz, Christopher Keene, Salvatore Genovese, Dave Haynes, Mark O'Neill

Related Topics: XML

XML: Article

Cleaning Up XML

Cleaning Up XML

Garbage in, garbage out - it's an axiom that applies to many aspects of enterprise development, but none more so than building reliable and robust Web applications and integration projects with XML. Since its inception, XML has been seen as the cure-all for every problem related to Web application development. However, poorly written XML can either slow down an integration project, or worse, cause the integration project to collapse.

It's important to understand some of the inefficiencies of XML, as well as how you can "clean up" and prevent the use of poorly written XML in development projects. After all, system performance is only as good as the data received and the instructions given. If errors are contained in the XML, it is more likely than not that the system will crash.

One of the main benefits of XML is that it provides mechanisms for verifying document validity through the use of Document Type Definitions (DTDs) and XML Schemas. When creating an XML document, developers can reference either of these mechanisms from within the document itself. The DTD or schema that is referenced will specify exactly how the XML document is to be processed, which elements and attributes are contained in the document, and the order in which these elements and attributes should be listed.

While referencing DTDs or schemas can guarantee the validity of XML documents, there is no requirement that developers use headers to reference DTDs or schemas at all. In fact, developers need only to follow simple syntax rules in order for an XML document to be "well-formed." However, a well-formed document is not necessarily a valid document. Without referencing either a DTD or a schema, there is no way to verify whether or not the XML document is valid. Therefore, measures must be taken to ensure that XML documents do, in fact, reference a DTD or schema.

To guarantee that an XML document references a DTD or schema, development teams can adopt a rule-based system that can detect and prevent errors within the XML code. Developers can create rules that impose constraints on XML documents to verify validity.

XML is created in plain text and utilizes actual words or phrases that have specific meanings to developers. However, even though XML can be read and written by humans, it does not necessarily mean that humans can understand XML - developers can still create unreadable XML code. An element that has a specific meaning to one developer may be of no use, or make no sense, to another developer.

To prevent ambiguous XML code, development teams must mutually agree upon a standard XML vocabulary. With a standard language in place, developers within a team will be more apt to understand each other's code. Naming conventions can be established that verify whether code follows rules that verify anything from W3C guidelines for a specific language, to team-naming standards, to project-specific design requirements, to the proper usage of custom XML tags.

Although the W3C has made an effort to establish a common language, vocabulary, and protocol for XML, these standards are still in development and are constantly changing. Companies that adhere to proposed standards that are not yet fully mature must be prepared to keep up with any changes of those standards in the future. Without any stability in XML standards, developers are forced to either keep up with the rapid changes or fall behind.

In spite of the chaos of standards that may exist for XML development, the recent release of Basic Profile 1.0 provides some guidance to developers seeking a widely used XML standard. The Web Services Interoperability Organization (WS-I) Basic Profile 1.0 consists of specifications that establish a baseline for interoperable Web services. These specifications provide a common framework for implementing XML and building integration projects. Therefore, developers can be confident that the XML standards they use will not be privy to constant flux and change if they create rules that enforce these standards.

If it is written and used properly, XML offers many advantages that can support reliable and robust Web applications. The key to successfully using XML in an integration project is to first understand the inefficiencies that may cause poorly written XML, then utilize the proper techniques that verify correctness at each level of the implementation.

More Stories By Adam Kolawa

Adam Kolawa is the co-founder and CEO of Parasoft, leading provider of solutions and services that deliver quality as a continuous process throughout the SDLC. In 1983, he came to the United States from Poland to pursue his PhD. In 1987, he and a group of fellow graduate students founded Parasoft to create value-added products that could significantly improve the software development process. Adam's years of experience with various software development processes has resulted in his unique insight into the high-tech industry and the uncanny ability to successfully identify technology trends. As a result, he has orchestrated the development of numerous successful commercial software products to meet growing industry needs to improve software quality - often before the trends have been widely accepted. Adam has been granted 10 patents for the technologies behind these innovative products.

Kolawa, co-author of Bulletproofing Web Applications (Hungry Minds 2001), has contributed to and written over 100 commentary pieces and technical articles for publications including The Wall Street Journal, Java Developer's Journal, SOA World Magazine, AJAXWorld Magazine; he has also authored numerous scientific papers on physics and parallel processing. His recent media engagements include CNN, CNBC, BBC, and NPR. Additionally he has presented on software quality, trends and development issues at various industry conferences. Kolawa holds a Ph.D. in theoretical physics from the California Institute of Technology. In 2001, Kolawa was awarded the Los Angeles Ernst & Young's Entrepreneur of the Year Award in the software category.

Comments (0)

Share your thoughts on this story.

Add your comment
You must be signed in to add a comment. Sign-in | Register

In accordance with our Comment Policy, we encourage comments that are on topic, relevant and to-the-point. We will remove comments that include profanity, personal attacks, racial slurs, threats of violence, or other inappropriate material that violates our Terms and Conditions, and will block users who make repeated violations. We ask all readers to expect diversity of opinion and to treat one another with dignity and respect.