By Dmitri Tcherevik
August 27, 2002
It's banal but true. The network, or the Internet, has indeed become a computer. It's possible now to assemble an application from network-based components called Web services.
The network serves as the platform, or "computer," for such an application. Granted, the new platform is not yet suitable for building robust business applications. Developing and maintaining such applications still requires significant effort. In addition, some of the required Web services are not yet fully defined. At the same time, it is clear now where the industry is going. We even have a new term describing the emerging platform - the Internet Operating System (OS), sometimes also called the Grid.
In this article I'll look at traditional computer and operating system components and map them to components of the Internet OS.
CPU
The Internet can be viewed as a massively parallel and distributed multiprocessor. A single task can be broken down into multiple threads of execution and run in parallel on multiple computers in the network. This approach is known as grid computing. When the nodes involved are coordinated only loosely, or not at all, it is also called peer-to-peer computing. Not every application can be effectively parallelized. The ones that can will leverage the enormous computing resources distributed over the Internet.
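The idea of breaking one task into independent work units can be sketched in a few lines of Python. This is only an illustration: local threads stand in for remote nodes, and the function names (count_primes, split_range, run_on_grid) are invented for this example rather than taken from any grid product.

```python
from concurrent.futures import ThreadPoolExecutor


def count_primes(bounds):
    """Work unit: count the primes in the half-open range [lo, hi)."""
    lo, hi = bounds
    count = 0
    for n in range(max(lo, 2), hi):
        # Trial division is slow but keeps the work unit self-contained.
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def split_range(lo, hi, parts):
    """Break [lo, hi) into at most `parts` contiguous, non-overlapping chunks."""
    step = max(1, (hi - lo + parts - 1) // parts)
    return [(start, min(start + step, hi)) for start in range(lo, hi, step)]


def run_on_grid(lo, hi, workers=4):
    """Hand each chunk to a worker and combine the partial results.

    Threads are a stand-in here; in a real grid each chunk would be
    shipped to a different machine on the network.
    """
    chunks = split_range(lo, hi, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(count_primes, chunks))


if __name__ == "__main__":
    print(run_on_grid(0, 100))
```

The task parallelizes well because the chunks share no state; only the final sum requires any coordination.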
Peer-to-peer applications have received a lot of press recently. We all remember Napster, Gnutella, SETI@Home, and numerous others. Some of them are quite scandalous. In one interesting case, a system administrator who installed a grid-computing screen saver on several machines in a university network was charged with stealing the school's network bandwidth and CPU cycles. In spite of cases like this, and in spite of what music industry execs would like us to believe, grid and peer-to-peer computing offer great potential for solving real business problems. Peer-to-peer systems are being studied at MIT, Berkeley, Stanford, and Microsoft Research. Sun Microsystems and IBM recently released grid computing-related products and are working actively in this area.
File Storage
File storage is a very important component of any platform, and the Internet OS is no exception. The file system of the Internet OS consists of the storage devices of the numerous computers connected to the network. According to recent estimates, the Internet has about 500 million users worldwide. If we take this as a rough estimate of the number of computers connected to the network, and assume that each machine has one gigabyte of spare disk space, we get a file system with a capacity of 5 × 10^17 bytes, or five hundred petabytes - and this is a very conservative estimate. Even if we partition this disk space among many applications, each application still gets a good chunk of storage - storage that is sufficient for saving many copies of the entire library of recorded music and video, for example. It's not a surprise, therefore, that applications leveraging this resource, such as Napster, Gnutella, Kazaa, and others, have become immensely popular with consumers. Recently, we've seen the emergence of business systems operating on the same principle. One could mention products from companies such as NextPage and Jibe. Support in the J2EE and .NET platforms for WebDAV, an XML-based extension of HTTP that defines how files and folders can be manipulated over the Internet, is further evidence that the Internet file system idea is gathering momentum.
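The capacity estimate earlier in this section is easy to check. The short Python sketch below simply repeats the arithmetic under the same two assumptions (one machine per user, one decimal gigabyte of spare space per machine); the function name is made up for this illustration.

```python
USERS = 500_000_000   # assumption from the text: one connected machine per user
SPARE_BYTES = 10**9   # assumption from the text: one (decimal) gigabyte spare each


def internet_fs_capacity(machines=USERS, spare_bytes=SPARE_BYTES):
    """Return the combined spare capacity in several decimal units."""
    total = machines * spare_bytes
    return {
        "bytes": total,
        "terabytes": total // 10**12,
        "petabytes": total // 10**15,
    }


if __name__ == "__main__":
    for unit, value in internet_fs_capacity().items():
        print(f"{value:,} {unit}")
```

Under these assumptions the total is 5 × 10^17 bytes, which is 500,000 terabytes, or 500 petabytes.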
Companies have traditionally focused on managing file storage concentrated in data centers or distributed over local area networks. In the near future, we can expect new products that manage the resources of the Internet file system and implement functions such as distributed search, indexing, replication, and fault tolerance.
Memory Hierarchy
A memory hierarchy is used to bridge the gap between the speed of the CPU and the speed of the I/O system. The memory hierarchy of a traditional computer system consists of the storage subsystem, the main memory, and one or more levels of CPU cache. Multiprocessor machines can be built with distributed shared memory or with memory subspaces that are private to each of the CPU modules.
If we consider the Internet as a massively parallel multiprocessor with a vastly distributed file system, then it becomes obvious that it can benefit from a memory hierarchy design of its own. The cache maintained by a Web browser on the local file system is one of the elements in this hierarchy. A Web page can be retrieved from the browser's cache in a matter of milliseconds. The size of this cache is vanishingly small, however, compared to the volume of the file storage distributed over the Internet. As a consequence, the probability of a cache miss is very high, and when there is a cache miss, it may take seconds to retrieve the same piece of content from a remote server. In addition, some content types, such as video or audio streams, cannot be cached locally and must always be retrieved from a remote server. Obviously, there is a need for an additional level in the memory hierarchy of the Internet platform.
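The cost of a high miss rate can be made concrete with the usual weighted-average formula for a two-level hierarchy. The Python sketch below is only illustrative; the hit rates and timings in the usage example are hypothetical values chosen to match the "milliseconds versus seconds" contrast above, not measurements.

```python
def expected_fetch_time(hit_rate, hit_seconds, miss_seconds):
    """Average retrieval time for a cache in front of a slower source.

    hit_rate     -- fraction of requests served from the cache (0.0 to 1.0)
    hit_seconds  -- time to serve a request from the cache
    miss_seconds -- time to serve a request from the remote server
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_rate * hit_seconds + (1.0 - hit_rate) * miss_seconds


if __name__ == "__main__":
    # Hypothetical numbers: 10 ms from the browser cache, 2 s from the origin.
    for rate in (0.1, 0.5, 0.9):
        print(rate, round(expected_fetch_time(rate, 0.010, 2.0), 3))
```

As long as most requests miss, the slow remote fetch dominates the average, which is why an intermediate level between the browser and the origin server pays off.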
This gap is being filled by content delivery networks (CDNs) such as the ones from Akamai, Exodus, or AT&T. A content delivery network caches popular content on edge servers that are located close, in network terms, to the end-user machines consuming the content. In addition to CDNs, companies are experimenting with clever peer-to-peer content replication schemes that place content close to where it is used. Instead of playing a movie from a central server, for example, a network node may choose to stream it from one of its nearby peers. Blue Falcon Networks is an example of a company in this space.
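The resulting three-level hierarchy (browser cache, edge server, origin server) can be sketched as a tiered lookup. This Python fragment is a simplification that uses plain dictionaries for the tiers and ignores expiry and eviction; it does not describe how any particular CDN works, and the names are invented for this example.

```python
def fetch(url, tiers):
    """Look up `url` in each tier, fastest first.

    `tiers` is a list of (name, store) pairs ordered from fastest to slowest,
    where each store is a dict mapping URLs to content. On a hit, the content
    is copied into every faster tier so the next request is served closer to
    the user. Returns (content, name_of_tier_that_served_it).
    """
    for index, (name, store) in enumerate(tiers):
        if url in store:
            content = store[url]
            for _, faster_store in tiers[:index]:
                faster_store[url] = content
            return content, name
    raise KeyError(url)


if __name__ == "__main__":
    browser, edge = {}, {}
    origin = {"/movie": "<movie bytes>"}
    tiers = [("browser", browser), ("edge", edge), ("origin", origin)]
    print(fetch("/movie", tiers)[1])  # served by the origin the first time
    print(fetch("/movie", tiers)[1])  # served by the browser cache afterwards
```

A second user behind the same edge server, starting with an empty browser cache, would be served from the edge tier rather than the origin.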
The area of content caching, replication, and delivery is still being very actively researched and developed. The near future should bring some interesting products and ideas.
Messaging
Interprocess communication (IPC) and messaging are important components of any platform. Shared memory, pipes, and sockets are examples of low-level IPC mechanisms. They form the foundation for higher-level mechanisms such as JMS, MSMQ, IIOP, SMTP, HTTP, and others. Some of these protocols, such as HTTP, are universally implemented. Others, such as MSMQ, tend to be platform or vendor specific.
The industry has converged on HTTP as the preferred communication mechanism for the new Internet OS. Web-based components composing an Internet application can be deployed on computers with vastly different operating systems. HTTP, in conjunction with other Web standards such as XML and SOAP, ensures that these components can easily exchange data and services. Interoperability of Web services is paramount to the success of the new platform, and several standards bodies are working hard to ensure that interoperability is achieved and then preserved as the platform matures.
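To show how these standards fit together, the following Python sketch wraps a call in a SOAP 1.1 envelope and prepares it as an HTTP POST using only the standard library. The service namespace, operation name, and URL are hypothetical placeholders; the request is built but never sent.

```python
import urllib.request
import xml.etree.ElementTree as ET

# Namespace of the SOAP 1.1 envelope.
SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"


def build_envelope(service_ns, operation, params):
    """Serialize one operation call as a SOAP 1.1 envelope (bytes)."""
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{service_ns}}}{operation}")
    for name, value in params.items():
        ET.SubElement(call, f"{{{service_ns}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="utf-8")


def build_request(url, service_ns, operation, params):
    """Prepare (but do not send) the HTTP POST that carries the envelope."""
    return urllib.request.Request(
        url,
        data=build_envelope(service_ns, operation, params),
        method="POST",
        headers={
            "Content-Type": "text/xml; charset=utf-8",
            "SOAPAction": f'"{service_ns}/{operation}"',
        },
    )


if __name__ == "__main__":
    # Hypothetical service: all three identifiers below are placeholders.
    request = build_request(
        "http://example.invalid/quotes", "urn:example:quotes",
        "GetQuote", {"symbol": "CA"},
    )
    print(request.data.decode("utf-8"))
```

Because the payload is plain XML carried over plain HTTP, the receiving component can run on any operating system and be written in any language.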
Directories
Directories have traditionally been used by operating system and application server components to locate resources and other components. JNDI, for example, is the J2EE naming and directory interface that is used to register and look up EJBs, RMI servers, JDBC connection pools, and other resources.
A Web-based application must be able to locate two types of resources distributed widely over the Internet: data and services. It can then add value by transforming data with services. The term "data" is somewhat misleading in this context. It's typically used to describe structured information saved in a database. Only a small fraction of information available on the Web matches this description. Most of the information is unstructured and exists in the form of images, documents, audio files, and so on. Therefore, we will follow the industry practice and use the term "content" to denote information resources accessed by Web-based applications. In summary, a Web-based application requires two types of directories: a content directory and a directory of available Web services.
UDDI, one of the fundamental standards of the new Internet OS, defines directories of Web services. With the help of such a directory, an application can easily locate a service that implements a certain interface. In other words, an application can locate a service based on its functional description. The Internet is vast, and a directory may contain entries for a large number of services that implement identical interfaces. The standard in its current form does not specify how these services can be distinguished based on their nonfunctional characteristics such as latency, price, or mean time to failure. Obviously, some additional work is required in this area.
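The distinction between functional and nonfunctional lookup can be illustrated with a toy in-memory directory. The Python sketch below is not the UDDI API or data model; the class names, fields, and sample entries are all invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceEntry:
    name: str
    interface: str      # functional description: which interface is implemented
    endpoint: str
    latency_ms: float   # nonfunctional characteristic
    price: float        # nonfunctional characteristic


class ServiceDirectory:
    """Toy registry: look up by interface, then rank by a nonfunctional key."""

    def __init__(self):
        self._entries = []

    def register(self, entry):
        self._entries.append(entry)

    def find(self, interface):
        """Functional lookup: every service implementing the interface."""
        return [e for e in self._entries if e.interface == interface]

    def best(self, interface, key):
        """Nonfunctional choice among services with identical interfaces."""
        matches = self.find(interface)
        return min(matches, key=key) if matches else None


if __name__ == "__main__":
    directory = ServiceDirectory()
    directory.register(ServiceEntry("QuotesA", "StockQuote", "http://a.invalid", 120.0, 0.02))
    directory.register(ServiceEntry("QuotesB", "StockQuote", "http://b.invalid", 40.0, 0.05))
    print(directory.best("StockQuote", key=lambda e: e.latency_ms).name)
    print(directory.best("StockQuote", key=lambda e: e.price).name)
```

The find method corresponds to what the text says a directory offers today; the best method stands for the ranking step that is left to the application.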
Content and services are equally important to a Web-based application, yet a single standard describing a universal content directory for the new Internet platform does not yet exist. WebDAV can be used to fill the void when needed, but it does not yet define some of the critical functions, such as content categorization or search. I expect a lot more activity in this space in the near future.
User Interface
Microsoft Windows, the X Window System, Mac OS X Quartz, Java Swing, KDE, and GNOME are graphical environments and toolkits that have traditionally been used to develop user interfaces for client/server applications and applications running on a single computer.
While it's possible to build a desktop-based user interface for a Web-based application, such applications tend to have Web-based user interfaces. This is easily explained: traditional applications are available only locally or on a LAN, whereas Web-based applications can be accessed from any device connected to the Internet, such as a PC, a PDA, or a mobile phone.
Graphical desktop environments of the client/server world were replaced with portals in the world of Internet-based applications. Just as a desktop environment is capable of aggregating several desktop applications, a Web portal is capable of aggregating user interfaces of multiple Web-based applications. Many of the features found in desktop environments, such as customization, floating windows, drag-and-drop, and others, made their way to portal products. At the same time, many people still prefer desktop environments to Web-based portals for day-to-day tasks such as sending an e-mail or scheduling a meeting. Some work will still be needed to raise the usability and intuitiveness of portal interfaces to that of the desktop environments, and to deliver consistent usability of the interface across many different types of devices.
Security
The list of security services commonly offered by application server and operating system platforms includes user profile management, authentication, authorization, secure communication, and single sign-on to various applications deployed on the platform.
The same list of services will ultimately be offered by the Internet OS to Web-based users and applications. The difference is in the scale. Where systems distributed over a LAN or WAN have to deal with thousands of users, applications deployed on the Web routinely deal with millions of users. In a recent benchmark, CA's CleverPath Portal handled profile information for 2.5 million users, and this number is far from the limit. Public portals such as MSN and Yahoo exhibit even higher levels of scalability.
Single sign-on to Web-based applications and services is critical to ensuring usability of portal interfaces. Microsoft Passport is an example of a single sign-on mechanism used mainly in applications associated with the MSN portal or the Windows platform. Products with similar functionality are available from other vendors and can be used to set up single sign-on domains for Web-based communities of users and applications not related to Microsoft. We can expect a lot more of these products to be offered as services on the Web.
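The core idea behind a single sign-on domain, one service vouching for a user to many applications, can be sketched with a signed token. The Python example below is a deliberately simplified illustration and not the protocol used by Microsoft Passport or any other product; it assumes that all applications in the domain share a secret key with the sign-on service, and the function names are invented.

```python
import hashlib
import hmac
import time


def issue_token(user, key, ttl_seconds=3600, now=None):
    """Sign-on service: issue 'user|expiry|signature' after authentication."""
    issued_at = time.time() if now is None else now
    payload = f"{user}|{int(issued_at + ttl_seconds)}"
    signature = hmac.new(key, payload.encode(), hashlib.sha256).hexdigest()
    return f"{payload}|{signature}"


def verify_token(token, key, now=None):
    """Any application in the domain: return the user name, or None."""
    try:
        user, expires, signature = token.rsplit("|", 2)
    except ValueError:
        return None  # malformed token
    expected = hmac.new(key, f"{user}|{expires}".encode(), hashlib.sha256).hexdigest()
    # Constant-time comparison; reject tampered or foreign tokens.
    if not hmac.compare_digest(signature.encode(), expected.encode()):
        return None
    current = time.time() if now is None else now
    return user if current < int(expires) else None


if __name__ == "__main__":
    shared_key = b"demo-key-for-illustration-only"
    token = issue_token("alice", shared_key)
    print(verify_token(token, shared_key))
```

The user authenticates once with the sign-on service; every other application only needs to verify the token, so it never sees the password.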
While it is mostly clear now how various aspects of the Internet OS security can be implemented, the standards that would tie together products from different vendors and ensure their interoperability are still largely missing. IBM and Microsoft recently released a standards proposal for Web services security. It's only the first step in a long process.
Transactions
Distributed transaction coordinators have become a staple of client/server systems. They can be found in J2EE, COM, CORBA, and other platforms. The purpose of a distributed transaction coordinator is to ensure atomicity, consistency, isolation, and durability (ACID) of multistep transactions spanning multiple systems distributed in a LAN environment.
While the ACID properties of transactions are equally important in applications leveraging content and services deployed on the Web, distributed transaction coordinators are largely irrelevant in the context of the Internet platform. They rely on a lot of heavy synchronous network traffic, whereas communication on the Internet tends to be asynchronous. In addition, the latency of network links on the Internet is much higher than that of a LAN, which makes two-phase commit coordination difficult if not impossible. Moreover, the two-phase commit protocol was not designed to handle long-running transactions.
The problem of distributed and long-running transaction coordination on the Web is addressed with the help of workflow systems. A multistep transaction spanning several Web services can be represented as a multistep workflow process. A workflow system can use asynchronous messaging to communicate with a remote Web service. Each step in the workflow is atomic and durable, and its results are committed immediately. Should any of the steps in the workflow fail, a sequence of compensating steps can be applied to undo the changes that have already been committed.
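The compensation pattern described above can be sketched in a few lines of Python. This is a minimal in-process illustration, not a workflow engine: the steps run synchronously, nothing is persisted, and the step names in the usage example are hypothetical.

```python
def run_workflow(steps):
    """Run steps in order; on failure, undo the committed ones in reverse.

    `steps` is a list of (name, action, compensation) tuples, where `action`
    and `compensation` are zero-argument callables. Each action is treated as
    committed as soon as it returns. Returns (True, None) on success or
    (False, name_of_failed_step) after compensation has run.
    """
    committed = []
    for name, action, compensation in steps:
        try:
            action()
        except Exception:
            # Compensate in reverse order: the last committed step first.
            for _, undo in reversed(committed):
                undo()
            return False, name
        committed.append((name, compensation))
    return True, None


if __name__ == "__main__":
    log = []

    def fail():
        raise RuntimeError("no cars available")

    # Hypothetical steps; each lambda stands in for a remote Web service call.
    result = run_workflow([
        ("hotel", lambda: log.append("book hotel"), lambda: log.append("cancel hotel")),
        ("air", lambda: log.append("book flight"), lambda: log.append("cancel flight")),
        ("car", fail, lambda: log.append("cancel car")),
    ])
    print(result, log)
```

Unlike a two-phase commit, no resource is held locked while the other steps run; the price is that intermediate results are briefly visible before a compensating step removes them.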
A workflow system can be used to coordinate processes involving multiple Web services. These processes can in turn be exposed as higher level services. For example, a trip-planning service can be built as a process coordinating hotel, air, and car reservation services. Many people believe that workflow systems will become the main programming tools for the Internet platform.
Monitoring and Management
Management and monitoring of client/server applications can be done at the application or infrastructure level. In fact, the vast majority of today's applications are managed at the infrastructure level. Instead of managing the program components that implement the business logic, we are managing networks, databases, and application servers that are used to run these components.
This approach cannot be applied to applications deployed on the Internet platform. The infrastructure that is used to run such applications is not only extremely diverse and distributed, but is also controlled by many different and unrelated companies. One can hardly imagine a database agent installed at company X reporting its state to an enterprise management solution, such as CA's Unicenter, installed at company Y. The situation is further complicated by the fact that, in an environment where many services are replicated and where applications bind to services late, it's difficult to outline the bounds of the infrastructure used by an application at any given moment.
These considerations are raising a slew of interesting problems, many of which remain largely unaddressed. The ideas exchanged in the industry revolve around self-organizing networks, management of Web services at the interface level, service level agreements, service level management, and other topics.
Summary
While it's clear that the Internet will become the platform for the next generation of applications, many aspects of this platform still require significant research and development. In particular, the coming months will see a lot of activity and new products in such areas as:
- Grid computing and peer-to-peer systems applications
- Distributed and replicated file system management
- Content caching, replication, and distribution; content delivery networks
- Secure and reliable messaging based on HTTP, SOAP, and XML
- Directories of distributed and replicated content and services
- Ubiquitous user interfaces aggregating Web applications
- Security of Web services, user profile management, and single sign-on
- Management of distributed processes and orchestration of services
- Monitoring and management of applications based on Web content and services
Copyright © 2002 SYS-CON Media, Inc. — All Rights Reserved.
More Stories By Dmitri Tcherevik
Mr. Tcherevik is a technology strategist in the Office of the CTO. After extensive development experience in Ingres and the Jasmine object database, Mr. Tcherevik led the development of CleverPath Enterprise Content Manager and Advantage Integration Server. Before joining CA, he helped develop a computer chess program under the leadership of Mikhail Botvinnik, the renowned world chess champion. Mr. Tcherevik graduated with honors from the Department of Cybernetics at the Moscow Engineering Physics Institute (MIFI), specializing in distributed systems, databases, and logic programming.