By Daniel Eng
December 1, 2007 06:30 PM EST
After reviewing several new alternative approaches, we identified SOA data services as the best one for meeting our criteria.
Data services are a form of Web Service optimized for real-time data integration. Data services virtualize data to decouple physical and logical locations and therefore avoid unnecessary data replication. Data services abstract complex data structures and syntax. Data services federate disparate data into useful composites. Data services also support data integration across both SOA and non-SOA applications.
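To make the idea of federation without replication concrete, here is a minimal sketch in Python. The two in-memory "source systems", their field names, and the `project_summary` function are hypothetical stand-ins chosen for illustration; they are not taken from the Pfizer or Composite implementation.

```python
from typing import Dict, List

# Hypothetical stand-ins for two independent source systems.
FINANCE_SYSTEM: Dict[str, dict] = {
    "P-001": {"cost_usd": 1_200_000},
    "P-002": {"cost_usd": 450_000},
}
RESOURCE_SYSTEM: List[dict] = [
    {"project_id": "P-001", "scientist": "A"},
    {"project_id": "P-001", "scientist": "B"},
    {"project_id": "P-002", "scientist": "C"},
]


def project_summary(project_id: str) -> dict:
    """Build a composite record by reading both sources at call time.

    Nothing is copied into an intermediate store, so each call
    reflects the current state of the sources.
    """
    cost = FINANCE_SYSTEM[project_id]["cost_usd"]
    headcount = sum(1 for r in RESOURCE_SYSTEM if r["project_id"] == project_id)
    return {"project_id": project_id, "cost_usd": cost, "headcount": headcount}
```

Because the composite is assembled on request, a change in either source shows up in the next call without any reload step.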
Architecturally, data services combine to form a middle layer of reusable services, or a data services layer, decoupled from both the underlying source-data layer and the consuming solutions layer. This provides the flexibility required to deal with each layer in the most effective manner, as well as the agility to work quickly across layers as applications, schemas, or underlying data sources change (see Figure 1).
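The decoupling between layers can be sketched as follows. This is only an illustration under assumed names: `CostSource`, the two source classes, and `portal_cost_label` are hypothetical, and the sketch shows the general principle rather than how any particular product implements it.

```python
from typing import Protocol


class CostSource(Protocol):
    """Logical contract that the consuming layer depends on."""

    def cost_for(self, project_id: str) -> float: ...


class LegacyCostSource:
    """Hypothetical source that stores costs in thousands of dollars."""

    def __init__(self) -> None:
        self._costs_k = {"P-001": 1200.0}

    def cost_for(self, project_id: str) -> float:
        return self._costs_k[project_id] * 1000


class NewCostSource:
    """Hypothetical replacement source with a different physical layout."""

    def __init__(self) -> None:
        self._rows = [("P-001", 1_200_000.0)]

    def cost_for(self, project_id: str) -> float:
        return next(cost for pid, cost in self._rows if pid == project_id)


def portal_cost_label(source: CostSource, project_id: str) -> str:
    """Consumer code: unchanged regardless of which source is plugged in."""
    return f"{project_id}: ${source.cost_for(project_id):,.0f}"
```

Swapping `LegacyCostSource` for `NewCostSource` changes nothing in `portal_cost_label`, which is the kind of isolation a middle layer is meant to provide when a source schema changes.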
Beyond providing complex multi-source data integration, data services meet our other criteria as well. Because data services are on-demand, they meet our requirement for real-time information delivery. By not replicating data, data services eliminate the time required for building and testing marts. Further, data services can be automatically generated directly from our data models and so don't require coding. Data services, due to abstraction, can often be reused across projects. Finally, data services, because of their architecture, XML support capabilities, and standards compliance, are inherently SOA-compliant.
Data Services Infrastructure Technology Selected
Once we chose a SOA data services approach, we searched for a data
services infrastructure provider that offered development tools and an
appropriate run-time environment. We selected Composite Software. With
more than 20 projects running in various Pfizer divisions and a
Composite Center of Excellence at our headquarters, Composite was a
proven vendor at Pfizer and its best-of-breed offerings met our search
criteria.
Now our overall data integration capabilities include data virtualization, data abstraction, and data federation across both SOA and non-SOA environments. Delivered via Composite's Information Server, the solution supports both our design and run-time requirements. At build time, we have an easy-to-use data modeler and code generator to abstract our data in the form of relational views for reporting and other uses and/or Web data services for SOA initiatives. Its high-performance query engine securely accesses, federates, and delivers the diverse distributed data to our consuming solutions in real-time.
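As a rough analogy for the two delivery forms mentioned above, the sketch below defines a relational view over two tables and then serializes the same rows as XML. It uses Python's built-in `sqlite3` and `xml.etree.ElementTree` modules purely for illustration; the table, view, and element names are assumptions, and a single in-memory database stands in for what would really be separate source systems.

```python
import sqlite3
import xml.etree.ElementTree as ET

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE finance_costs (project_id TEXT, cost_usd REAL);
    CREATE TABLE hr_assignments (project_id TEXT, scientist TEXT);
    INSERT INTO finance_costs VALUES ('P-001', 1200000), ('P-002', 450000);
    INSERT INTO hr_assignments VALUES ('P-001', 'A'), ('P-001', 'B'), ('P-002', 'C');

    -- The view is the abstraction: consumers query it, not the base tables.
    CREATE VIEW project_overview AS
    SELECT c.project_id, c.cost_usd, COUNT(a.scientist) AS headcount
    FROM finance_costs AS c
    LEFT JOIN hr_assignments AS a ON a.project_id = c.project_id
    GROUP BY c.project_id, c.cost_usd;
    """
)


def overview_rows() -> list:
    """Relational form: rows from the view, suitable for reporting tools."""
    return conn.execute(
        "SELECT project_id, cost_usd, headcount "
        "FROM project_overview ORDER BY project_id"
    ).fetchall()


def overview_xml() -> str:
    """Service form: the same rows serialized as XML for a portal to consume."""
    root = ET.Element("projects")
    for project_id, cost_usd, headcount in overview_rows():
        ET.SubElement(
            root,
            "project",
            id=project_id,
            costUsd=str(int(cost_usd)),
            headcount=str(headcount),
        )
    return ET.tostring(root, encoding="unicode")
```

Both functions read through the same view, so reporting consumers and service consumers share one definition of the data.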
The Proof Was in the Portal
With our data services
strategy and data integration toolset in hand, our next task was to do
a pilot project. We wanted to see if we could successfully complete the
project, and if we could complete it much faster while complying with
SOA principles.
For our pilot, we selected the Drug Discovery Portfolio portal. This project easily met our evaluation criteria.
Business Requirements
Senior management,
project team leaders, business analysts, and research scientists across
Pfizer's R&D and commercial business units need to continuously
evaluate our portfolio of discovery projects and drugs in development.
This analysis includes how these projects fit into Pfizer's overall
strategic portfolio as well as how each will be impacted by costs,
market conditions and available resources. A complete picture of each
particular project, as well as an overview of all the projects, is
needed for major business decisions to be based on all relevant
factors. Real-time access to this information is critical, so Pfizer
can react rapidly and intelligently to unforeseen events.
User Interface Requirements
We selected a
Web portal as the user interface because this provides the most
flexible and accessible solution for our wide range of information
users. This means existing data has to be delivered in the form of Web
data services for our portal developers and our portal toolset to
consume easily.
Data Integration Requirements
Key data to
be delivered includes both key metrics and details such as project
costs, resources, timelines and ROI calculations, to name a few. This
diverse data needs to be integrated from a wide variety of source
applications from across various Pfizer groups. This diversity of
source system data structures enabled us to evaluate and thoroughly
test Composite's data connector and transform capabilities during the
pilot project. The dynamic nature of the sources and the need for
real-time delivery also let us thoroughly test Composite's
high-performance query algorithms. Because many teams from across the globe needed
to be involved to provide access to the right data, we added
ease-of-use to our RAD evaluation criteria.
Pilot Benchmark: The Data Mart Approach
To compare
the relative and absolute strengths and weaknesses of the new data
services approach and the Information Server versus our traditional
approach, we invested in a small benchmark of the "old way."
Benchmarking the functional and technical specifications let us
compare end solution delivery. Benchmarking the development process
let us compare time-to-solution and development costs.
Functional and Technical Specification
We
already knew we could use our ETL/data mart tools to successfully
combine the data required into a mart. Unfortunately, putting the
relational data into a mart was only half the job. We still needed to
get this data out of the mart and into the portal in the form of a Web
Service. We found this requires manual coding and an additional
toolset. What's more, to achieve the real-time delivery requirement, we
found we needed to achieve unrealistic refresh rates using highly
complex change data capture techniques.
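The refresh problem can be illustrated with a small sketch. The dictionaries and function names below are hypothetical; the point is only that a replicated copy is current as of its last refresh, while an on-demand read always sees the source.

```python
import copy

# Hypothetical live source system.
source = {"P-001": {"status": "Phase I"}}

# Mart approach: a copy taken at the last scheduled refresh.
mart_snapshot = copy.deepcopy(source)


def read_from_mart(project_id: str) -> str:
    """Return the status as of the last refresh."""
    return mart_snapshot[project_id]["status"]


def read_on_demand(project_id: str) -> str:
    """Return the status directly from the source at call time."""
    return source[project_id]["status"]


def refresh_mart() -> None:
    """Reload the whole snapshot from the source."""
    mart_snapshot.clear()
    mart_snapshot.update(copy.deepcopy(source))


# The source changes between refreshes.
source["P-001"]["status"] = "Phase II"
```

Until `refresh_mart()` runs, `read_from_mart` returns the old status while `read_on_demand` returns the new one, which is why a mart needs very frequent refreshes to approximate real-time delivery.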
Development Process
In a side-by-side comparison, Table 1 represents the steps used in an ETL versus a data services approach.
Problems with the Data Mart Approach
The ETL/data mart approach was not ideal for this project for the following reasons:
• We could only come close to meeting the real-time integration
requirements if we used advanced change data capture and frequent
refresh features.
• We found that the data mart was physically instantiated in a
relational form. Yet, our portal developers wanted the data in the form
of WSDL Web Services that are easier for the portal to consume.
• Sequential development such as building the ETL scripts, the
mart, the delivery scripts, and then the portal application stretched
the elapsed time thereby pushing out business benefits and adding costs.
• ETL and Web Service scripting were slow manual development processes.
• Scheduling the setup of the data mart infrastructure required
coordinating with our operations group, fitting into its schedule and
backlog.
• Replicated data in the mart would need to be maintained and controlled in addition to the original source data.
• Data security required additional manual coding.
• Any changes required ETL scripts to be changed, as well as the
mart to be reloaded, slowing our response to new requirements or even
simple bug fixes.
• More data structure and syntax expertise was required by developers throughout the process, not just basic SQL.
SOA Data Services Approach Pilot Meets the Spec, Is Faster, and More
The data services approach proved ideal for our Drug Discovery Portfolio Portal project.
• We completed our project in less than half the time of
traditional development. Much of the data-level development was
automated, freeing our skilled development team to work on
application-level development.
• Fewer skills were needed due to the drag-and-drop data service
development environment, built-in security, and automated generation of
Web data services.
• SOA-compliant WSDL data services provided data in the form the portal developers needed.
• Loosely coupled data services were easier to maintain than ETL
scripts in case of changes either to the underlying data sources or the
portal.
• Data service assets built for the portal project can be reused by other development projects.
• We no longer needed our IT operations team to build and maintain
the data mart infrastructure, and we incurred no extra costs for the mart itself.
Pfizer Informatics Adopts Data Services Approach
Going forward, we plan to use the data services approach and tools for
all projects requiring complex data integration across multiple
heterogeneous sources because the data services approach reduces
unnecessary data replication and provides real-time information
delivery, rapid application development, and SOA compliance.
We learned a number of lessons applicable to future projects. Data integration doesn't have to be hard or time-consuming with the right approach and the right supporting tools. Virtualizing data rather than replicating it saves time and money. Rapid prototyping is possible, even automatic, when the right tools are used. Agility and reuse, the promise of SOA, come to life in loosely coupled data services that span the gap between source data and end applications.
Moving from Pilot to Enterprise, Funded by Time and Cost Savings
With the new SOA data services approach to data integration proven, we
have now put together our roadmap for future adoption. First, we will
educate our business analysts, developers, and architects on when to
use data services, and adopt the RAD approach to building SOA data
services as the solution standard across all new SOA projects where
data integration is required. Second, we plan to
implement a "data services reuse" metric for measuring success across
future projects to reduce development and maintenance costs. In
addition, we're working with the centralized shared services team to
create a Data Services Center of Excellence that promotes best
practices, optimizes economies of scale, and maximizes reach across
projects. Finally, we'll continue to seek emerging technologies and
agile development practices that accelerate SOA projects and enable us
to move to SOA in a safe and powerful way.
Conclusion
As advances in medical care and the
need for new medicines continue to grow, the need for better ways to
manage and deliver information is growing. In the same spirit that
makes Pfizer a trusted leader in drug discovery and commercialization,
the informatics group is pressing forward to meet the ever-demanding
needs of our internal R&D customers as well.
Successful drug discovery needs data fast. Achieving rapid delivery requires new real-time portals and composite applications that rely heavily on existing data sourced from multiple systems across the enterprise. Delivering that data to our researchers and managers has been one of our biggest bottlenecks, adding months and cost to our project timelines. These data integration needs, along with our aggressive SOA strategy and RAD objectives, have driven us to find, test, and deploy a new approach to data integration: SOA data services.
Copyright © 2007 SYS-CON Media, Inc. — All Rights Reserved.
Syndicated stories and blog feeds, all rights reserved by the author.
More Stories By Daniel Eng
Daniel Eng has over 17 years of diverse IT experience in managing projects, leading technical teams, and developing enterprise applications within Fortune 100 companies. Currently at Pfizer Global Research and Development, Dan is leading efforts in transitioning business processes and applications into a SOA environment by using emerging technologies and agile management practices. Prior to Pfizer, he was an independent consultant helping his Fortune 500 clients in developing intranet sites, portable applications and e-commerce solutions. Dan has also worked in many e-commerce start-ups and healthcare organizations. He holds a BSEE degree from Polytechnic University and an MBA degree from Gonzaga University.