SOA to the Rescue, When Drug Discovery Needs Data Fast!

Information is key to drug discovery

SOA Data Services Approach Selected
After reviewing several new alternative approaches, we identified SOA data services as the best one for meeting our criteria.

Data services are a form of Web Service optimized for real-time data integration. Data services virtualize data to decouple physical and logical locations and therefore avoid unnecessary data replication. Data services abstract complex data structures and syntax. Data services federate disparate data into useful composites. Data services also support data integration across both SOA and non-SOA applications.
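As a rough illustration of virtualization and federation, the sketch below combines two independent sources at request time instead of copying them into a mart. It is not Composite's API: the in-memory SQLite databases, the table and column names, and the `project_overview` function are all hypothetical stand-ins.

```python
import sqlite3


def make_source(schema_sql, insert_sql, rows):
    """Create an in-memory database standing in for one source system."""
    conn = sqlite3.connect(":memory:")
    conn.execute(schema_sql)
    conn.executemany(insert_sql, rows)
    return conn


# Two independent "source systems" with different schemas (hypothetical data).
finance = make_source(
    "CREATE TABLE proj_cost (proj_id TEXT, cost_usd REAL)",
    "INSERT INTO proj_cost VALUES (?, ?)",
    [("P1", 1200000.0), ("P2", 450000.0)],
)
planning = make_source(
    "CREATE TABLE schedule (project TEXT, phase TEXT)",
    "INSERT INTO schedule VALUES (?, ?)",
    [("P1", "Phase II"), ("P2", "Preclinical")],
)


def project_overview(project_id):
    """Data service: query both sources on demand and return one composite record."""
    cost = finance.execute(
        "SELECT cost_usd FROM proj_cost WHERE proj_id = ?", (project_id,)
    ).fetchone()
    phase = planning.execute(
        "SELECT phase FROM schedule WHERE project = ?", (project_id,)
    ).fetchone()
    if cost is None and phase is None:
        return None  # unknown project in every source
    return {
        "project_id": project_id,
        "cost_usd": cost[0] if cost else None,
        "phase": phase[0] if phase else None,
    }
```

Because nothing is replicated, a change made in a source system is visible on the very next call to `project_overview`, and the caller never sees the differing column names.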

Architecturally, data services combine to form a middle layer of reusable services, or a data services layer, decoupled from both the underlying source-data layer and the consuming solutions layer. This provides the flexibility to deal with each layer in the most effective manner, as well as the agility to respond quickly across layers as applications, schemas, or underlying data sources change (see Figure 1).
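The decoupling can be sketched in a few lines. In this minimal example, which assumes made-up source formats and class names, the consuming code depends only on the logical record returned by the service, so the source behind it can be swapped without touching the consumer.

```python
from typing import Callable, Dict


# Source-data layer: two hypothetical back ends that expose the same fact
# under different field names and units.
def legacy_source(project_id: str) -> Dict[str, object]:
    return {"PROJ": project_id, "COST_K": 1200}  # cost in thousands of USD


def new_source(project_id: str) -> Dict[str, object]:
    return {"id": project_id, "cost_usd": 1_200_000}


# Data services layer: maps whichever source is configured to one logical shape.
class ProjectCostService:
    def __init__(self, fetch: Callable[[str], Dict[str, object]],
                 to_usd: Callable[[Dict[str, object]], int]):
        self._fetch = fetch
        self._to_usd = to_usd

    def get(self, project_id: str) -> Dict[str, object]:
        raw = self._fetch(project_id)
        return {"project_id": project_id, "cost_usd": self._to_usd(raw)}


# Consuming solutions layer: knows only the logical record.
def render_cost_line(service: ProjectCostService, project_id: str) -> str:
    rec = service.get(project_id)
    return f"{rec['project_id']}: ${rec['cost_usd']:,}"


before = ProjectCostService(legacy_source, lambda r: r["COST_K"] * 1000)
after = ProjectCostService(new_source, lambda r: r["cost_usd"])
```

Both `render_cost_line(before, "P1")` and `render_cost_line(after, "P1")` produce the same line, which is the property the middle layer is meant to guarantee when a source changes.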

Beyond providing complex multi-source data integration, data services meet our other criteria as well. Because data services are on-demand, they meet our requirement for real-time information delivery. By not replicating data, data services eliminate the time required for building and testing marts. Further, data services can be automatically generated directly from our data models and so don't require coding. Data services, due to abstraction, can often be reused across projects. Finally, data services, because of their architecture, XML support capabilities, and standards compliance, are inherently SOA-compliant.

Data Services Infrastructure Technology Selected
Once we chose a SOA data services approach, we searched for a data services infrastructure provider that offered development tools and an appropriate run-time environment. We selected Composite Software. With more than 20 projects running in various Pfizer divisions and a Composite Center of Excellence at our headquarters, Composite was a proven vendor at Pfizer and its best-of-breed offerings met our search criteria.

Now our overall data integration capabilities include data virtualization, data abstraction, and data federation across both SOA and non-SOA environments. Delivered via Composite's Information Server, the solution supports both our design and run-time requirements. At build time, we have an easy-to-use data modeler and code generator to abstract our data in the form of relational views for reporting and other uses and/or Web data services for SOA initiatives. At run time, the Information Server's high-performance query engine securely accesses, federates, and delivers the diverse distributed data to our consuming solutions in real time.

The Proof Was in the Portal
With our data services strategy and data integration toolset in hand, our next task was to do a pilot project. We wanted to see if we could successfully complete the project, and if we could complete it much faster while complying with SOA principles.

For our pilot, we selected the Drug Discovery Portfolio portal. This project easily met our evaluation criteria.

Business Requirements
Senior management, project team leaders, business analysts, and research scientists across Pfizer's R&D and commercial business units need to continuously evaluate our portfolio of discovery projects and drugs in development. This analysis includes how these projects fit into Pfizer's overall strategic portfolio as well as how each will be impacted by costs, market conditions, and available resources. A complete picture of each particular project, as well as an overview of all the projects, is needed so that major business decisions are based on all relevant factors. Real-time access to this information is critical so that Pfizer can react rapidly and intelligently to unforeseen events.

User Interface Requirements
We selected a Web portal as the user interface because this provides the most flexible and accessible solution for our wide range of information users. This means existing data has to be delivered in the form of Web data services for our portal developers and our portal toolset to consume easily.

Data Integration Requirements
The data to be delivered includes both key metrics and details such as project costs, resources, timelines, and ROI calculations, to name a few. This diverse data needs to be integrated from a wide variety of source applications across various Pfizer groups. The diversity of source system data structures enabled us to thoroughly test Composite's data connector and transform capabilities during the pilot project. The dynamic nature of the sources and the need for real-time delivery likewise let us thoroughly test Composite's high-performance query algorithms. Because many teams from across the globe needed to be involved to provide access to the right data, we added ease of use to our RAD evaluation criteria.

Pilot Benchmark: The Data Mart Approach
To compare the relative and absolute strengths and weaknesses of the new data services approach and the Information Server versus our traditional approach, we invested in a small benchmark of the "old way." Benchmarking the functional and technical specifications lets us compare end solution delivery. Benchmarking the development process lets us compare time-to-solution and development costs.

Functional and Technical Specification
We already knew we could use our ETL/data mart tools to successfully combine the required data into a mart. Unfortunately, putting the relational data into a mart was only half the job. We still needed to get this data out of the mart and into the portal in the form of a Web Service. We found this required manual coding and an additional toolset. What's more, to meet the real-time delivery requirement, we found we would need unrealistic refresh rates using highly complex change data capture techniques.
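A back-of-the-envelope calculation shows why refresh rates become unrealistic as the freshness target tightens. The sketch below assumes a deliberately simple model that is not taken from the pilot: the worst-case age of data in a mart is one refresh interval plus the time a load takes to finish. The function name and the sample numbers are illustrative only.

```python
import math

SECONDS_PER_DAY = 24 * 60 * 60


def refreshes_per_day(max_staleness_s: float, load_duration_s: float) -> int:
    """Refreshes needed per day so mart data is never older than max_staleness_s.

    Assumed model: worst-case staleness = refresh interval + load duration,
    so the interval between refresh starts must be at most the difference.
    """
    interval_s = max_staleness_s - load_duration_s
    if interval_s <= 0:
        raise ValueError("a single load already takes longer than the staleness target")
    return math.ceil(SECONDS_PER_DAY / interval_s)
```

Under this model, a one-hour freshness target with a ten-minute load needs `refreshes_per_day(3600, 600)` = 29 refreshes a day, while a one-minute target with a 45-second load needs `refreshes_per_day(60, 45)` = 5760. An on-demand data service avoids the question because it reads the source at request time.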

Development Process
In a side-by-side comparison, Table 1 represents the steps used in an ETL versus a data services approach.

Problems with the Data Mart Approach
The ETL/data mart approach was not ideal for this project for the following reasons:
  •   We could only come close to meeting the real-time integration requirements if we used advanced change data capture and frequent refresh features.
  •   We found that the data mart was physically instantiated in a relational form. Yet, our portal developers wanted the data in the form of WSDL Web Services that are easier for the portal to consume.
  •   Sequential development (building the ETL scripts, then the mart, then the delivery scripts, and finally the portal application) stretched the elapsed time, pushing out business benefits and adding costs.
  •   ETL and Web Service scripting were slow manual development processes.
  •   Scheduling the setup of the data mart infrastructure required coordinating with our operations group, fitting into its schedule and backlog.
  •   Replicated data in the mart would need to be maintained and controlled in addition to the original source data.
  •   Data security required additional manual coding.
  •   Any changes required ETL scripts to be changed, as well as the mart to be reloaded, slowing our response to new requirements or even simple bug fixes.
  •   More data structure and syntax expertise was required by developers throughout the process, not just basic SQL.

SOA Data Services Approach Pilot Meets the Spec, Is Faster, and More
The data services approach proved ideal for our Drug Discovery Portfolio Portal project.
  •   We completed our project in less than half the time of traditional development. Much of the data-level development was automated, freeing our skilled development team to work on application-level development.
  •   Fewer skills were needed due to the drag-and-drop data service development environment, built-in security, and automated generation of Web data services.
  •   SOA-compliant WSDL data services provided data in the form the portal developers needed.
  •   Loosely coupled data services were easier to maintain than ETL scripts in case of changes either to the underlying data sources or the portal.
  •   Data service assets built for the portal project can be reused by other development projects.
  •   We no longer needed our IT operations team to build and maintain the data mart infrastructure, and we incurred no extra costs for the mart itself.

Pfizer Informatics Adopts Data Services Approach
Going forward, we plan to use the data services approach and tools for all projects requiring complex data integration across multiple heterogeneous sources because the data services approach reduces unnecessary data replication and provides real-time information delivery, rapid application development, and SOA compliance.

We learned a number of lessons applicable to future projects. Data integration doesn't have to be hard or time-consuming with the right approach and the right supporting tools. Virtualizing data rather than replicating it saves time and money. Rapid prototyping is possible, even automatic, when the right tools are used. Agility and reuse, the promise of SOA, come to life in loosely coupled data services that span the gap between source data and end applications.

Moving from Pilot to Enterprise, Funded by Time and Cost Savings
With the new SOA data services approach to data integration proven, we have now put together our roadmap for future adoption. First, the roadmap includes educating our business analysts, developers, and architects on when to use data services, and adopting the RAD approach to building SOA data services as the solution standard across all new SOA projects where data integration is required. Second, we plan to implement a "data services reuse" metric for measuring success across future projects to reduce development and maintenance costs. In addition, we're working with the centralized shared services team to create a Data Services Center of Excellence that promotes best practices, optimizes economies of scale, and maximizes reach across projects. Finally, we'll continue to seek emerging technologies and agile development practices that accelerate SOA projects and enable us to move to SOA in a safe and powerful way.
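The article does not define the "data services reuse" metric. One plausible definition, offered here purely as an assumption, is the share of data services consumed by two or more distinct projects. The function and the sample service and project names below are hypothetical.

```python
from collections import defaultdict
from typing import Iterable, Tuple


def reuse_ratio(usage: Iterable[Tuple[str, str]]) -> float:
    """Share of data services consumed by two or more distinct projects.

    usage: (service_name, project_name) pairs, one per observed consumption.
    Returns 0.0 when no usage has been recorded.
    """
    projects_by_service = defaultdict(set)
    for service, project in usage:
        projects_by_service[service].add(project)
    if not projects_by_service:
        return 0.0
    reused = sum(1 for projects in projects_by_service.values() if len(projects) >= 2)
    return reused / len(projects_by_service)


sample_usage = [
    ("project_cost", "portfolio_portal"),
    ("project_cost", "forecasting"),
    ("timeline", "portfolio_portal"),
    ("timeline", "portfolio_portal"),  # repeat use by the same project is not reuse
]
```

With this sample, `reuse_ratio(sample_usage)` is 0.5: one of the two services is shared across projects. Other definitions, such as average consumers per service, would be equally defensible.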

Conclusion
As advances in medical care and the need for new medicines continue to grow, so does the need for better ways to manage and deliver information. In the same spirit that makes Pfizer a trusted leader in drug discovery and commercialization, the informatics group is pressing forward to meet the ever-demanding needs of our internal R&D customers as well.

Successful drug discovery needs data fast. Achieving rapid delivery requires new real-time portals and composite applications that rely heavily on existing data sourced from multiple systems across the enterprise. Delivering that data to our researchers and managers has been one of our biggest bottlenecks, adding months and cost to our project timelines. These data integration needs, along with our aggressive SOA strategy and RAD objectives, have driven us to find, test, and deploy a new approach to data integration: SOA data services.

More Stories By Daniel Eng

Daniel Eng has over 17 years of diverse IT experience in managing projects, leading technical teams, and developing enterprise applications within Fortune 100 companies. Currently at Pfizer Global Research and Development, Dan is leading efforts in transitioning business processes and applications into a SOA environment by using emerging technologies and agile management practices. Prior to Pfizer, he was an independent consultant helping his Fortune 500 clients in developing intranet sites, portable applications and e-commerce solutions. Dan has also worked in many e-commerce start-ups and healthcare organizations. He holds a BSEE degree from Polytechnic University and an MBA degree from Gonzaga University.
