| By Srinivasan Sundara Rajan | Article Rating: |
|
| April 13, 2012 06:45 AM EDT | Reads: |
3,950 |
Data Mining helps organizations to discover new insights from existing data, so that predictive techniques can be applied towards various business needs. The following are the typical characteristics of data mining.
- Extends Business Intelligence, beyond Query, Reporting and OLAP (Online Analytical Processing)
- Data Mining is cornerstone for assessing the customer risk, market segmentation and prediction
- Data Mining is about performing computationally complex analysis techniques on very large volumes of data
- It combines the analysis of historical data with modeling techniques towards future predictions, it turns Operations into performance

The following are the use cases that can benefit from the application of data mining:
- Manufacturing / Product Development: Understanding the defect and customer complaints into a model that can provide insight into customer satisfaction and help enterprises build better products
- Consumer Payments: Understand the payment patterns of consumers to predict market penetration analysis and discount guidelines.
- Consumer Industry: Customer segmentation to understand the customer base and help targeted advertisements and promotions.
- Consumer Industry: Campaign effectiveness can be gauged with customer segmentation coupled with predictive marketing models.
- Retail Indsutry: Supply chain efficiencies can be brought by mining the supply demand data
‘In Database' Data Mining
Data Mining is typically a multi-step process.
- Define the Business Issue to Be Addressed, e.g., Customer Attrition, Fraud Detection, Cross Selling.
- Identify the Data Model / Define the Data / Source the Data.(Data Sources, Data Types, Data Usage etc.)
- Choose the Mining Technique (Discovery Data Mining, Predictive Data Mining, Clustering, Link Analysis, Classification, Value Prediction)
- Interpret the Results (Visualization Techniques)
- Deploy the Results (CRM Systems.)
Initially Data Mining has been implemented with a combination of multiple tools and systems, which resulted in latency and a long cycle for realization of results.
Sensing this issue, major RDBMS vendors have implemented Data Mining as part of their core database offering. This offering has the following key features:
- Data Mining engine resides inside the traditional database environment facilitating easier licensing and packaging options
- Eliminates the data extraction and data movement and avoids costly ETL process
- Major Data Mining models are available as pre-built SQL functions which can be easily integrated into the existing database development process.
The following is some of the information about data mining features as part of the popular databases:
Built as DB2 data mining functions, the Modeling and Scoring services directly integrate data mining technology into DB2. This leads to faster application performance. Developers want integration and performance, as well as any facility to make their job easier. The model can be used within any SQL statement. This means the scoring function can be invoked with ease from any application that is SQL aware, either in batch, real time, or as a trigger.
Oracle Data Mining, a component of the Oracle Advanced Analytics Option, delivers a wide range of cutting edge machine learning algorithms inside the Oracle Database. Since Oracle Data Mining functions reside natively in the Oracle Database kernel, they deliver unparallel performance, scalability and security. The data and data mining functions never leave the database to deliver a comprehensive in-database processing solution.
Data Virtualization: Data Virtualization is the new concept that allows , enterprises to access their information contained in disparate data sources in a seamless way. As mentioned in my earlier articles there are specialized Data virtualization platforms from vendors like, Composite Software, Denodo Technologies, IBM, Informatica, Microsoft have developed specialized data virtualization engines. My earlier article details out Data Virtualization using Middleware Vs RDBMS.
Data virtualization solutions provide a virtualized data services layer that integrates data from heterogeneous data sources and content in real time, near-real time, or batch as needed to support a wide range of applications and processes. : The Forrester Wave: Data Virtualization, Q1 2012 puts the data virtualization in the following perspective, in the past 24 months, we have seen a significant increase in adoption in the healthcare, insurance, retail, manufacturing, eCommerce, and media/entertainment sectors. Regardless of industry, all firms can benefit from data virtualization.
Data Mining Inside Data Virtualization Platforms?
The increase in data sources, especially integration with Big Data and Unstructured data made Data Virtualization platform a important part of enterprise data access strategy. Data virtualization provides the following attributes for efficient data access across enterprise.
- Abstraction: Provides location, API, language and storage technology independent access of data
- Federation: Converges data from multiple disparate data sources
- Transformation: Enriches the quality and quantity of data on a need basis
- On-Demand Delivery: Provides the consuming applications the required information on-demand
With the above benefits of the Data Virtualization Platform in mind, it is evident that enterprises will find it more useful if Data Virtualization platforms are built with Data Mining Models and Algorithms, so that effective Data Mining can be performed on top of Data Virtualization platform.
As the important part of Data Mining is about identifying the correct data sources and associated events of interest, effective Data Mining can be built if disparate data sources are brought under the scope of Data Virtualization Platform rather than putting the Data Mining inside a single database engine.
The following extended view of Data Virtualization Platform signifies how Data Mining can be part of Data Virtualization Platform.

Summary
Data Virtualization is becoming part of the mainstream enterprise data access strategy, mainly because it abstracts the multiple data sources and avoids complex ETL processing and facilitates the single version of truth, data quality and zero latency enterprise.
If value adds like a Data Mining engine can be built on top of the existing Data Virtualization platform, the enterprises will benefit further.
Published April 13, 2012 Reads 3,950
Copyright © 2012 SYS-CON Media, Inc. — All Rights Reserved.
Syndicated stories and blog feeds, all rights reserved by the author.
More Stories By Srinivasan Sundara Rajan
Srinivasan Sundara Rajan (Also Known As Sundar) Is A Enterprise Technology Enabler for realizing business capabilities. His primary focus is enabling Agile Enterprises by facilitating the adoption of Every Thing As A Service Model with particular concentration on BpaaS (Business Process As A Service). He also helps enterprises in getting meaningful insights from their structured and unstructured and real time data sources. All the views expressed are Srinivasan's independent analysis of industry and solutions and need not necessarily be of his current or past organizations. Srinivasan would like to thank every one who augmented his Architectural skills with Analytical ideas.
- Cloud People: A Who's Who of Cloud Computing
- Cloud Expo New York Speaker Profile: Dave Linthicum – Cloud Technology Partners
- Cloud Expo New York: Cloud Is Changing the Economics of Business
- Best CIO Practices Shared from SHI’s Customers
- Cloud Expo New York: Deploying Hybrid Cloud for Performance and Uptime
- Cloud Expo New York: Delivering Digital Marketing on the Cloud
- Big Data Isn’t About the Database, It’s About the Application
- Cloud Expo New York: Rethink IT and Reinvent Business with IBM SmartCloud
- Cloudant to Exhibit at Cloud Expo & Big Data Expo New York
- BEA Updates WebLogic SOA Portal for Web 2.0 Era
- How to Move Your Oracle Databases to Amazon EC2 Cloud
- The Accessibility of the Cloud
- Cloud People: A Who's Who of Cloud Computing
- Cloud Expo New York: Best CIO Practices Shared from SHI’s Customers
- Cloud Expo New York Speaker Profile: Dave Linthicum – Cloud Technology Partners
- Cloud Expo New York: Cloud Is Changing the Economics of Business
- Cloud Expo New York: How to Use Google Apps Script
- Cloud Computing Bootcamp at Cloud Expo New York
- Rackspace Hosting Named “Platinum Plus Sponsor” of Cloud Expo New York
- Best CIO Practices Shared from SHI’s Customers
- Cloud Expo New York: Why Big Data Is Really About Small Data
- Cloud Expo New York: Deploying Hybrid Cloud for Performance and Uptime
- Cloud Expo New York: Delivering Digital Marketing on the Cloud
- Small Cancers, Big Data, and a Life Examined
- The i-Technology Right Stuff
- The Top 150 Players in Cloud Computing
- Who Are The All-Time Heroes of i-Technology?
- Where Are RIA Technologies Headed in 2008?
- Get the Message
- i-Technology Viewpoint: Is Web 2.0 the Global SOA?
- ESB Myth Busters: 10 Enterprise Service Bus Myths Debunked
- i-Technology Viewpoint: Thinking Outside the VC Box
- i-Technology Viewpoint: When to Leave Your First IT Job
- SOA Web Services Edge Conference Coverage on SYS-CON.TV
- SYS-CON.TV's "SOA Web Services" and "Enterprise Open Source" Programs To Air in December
- Five Reasons Why Web 2.0 Matters

















