Data Mining and Data Virtualization

Extending Data Virtualization Platforms

Data Mining helps organizations discover new insights from existing data, so that predictive techniques can be applied to various business needs. The following are the typical characteristics of data mining:

  • Extends Business Intelligence beyond Query, Reporting and OLAP (Online Analytical Processing)
  • Serves as a cornerstone for customer risk assessment, market segmentation and prediction
  • Involves computationally complex analysis techniques performed on very large volumes of data
  • Combines the analysis of historical data with modeling techniques to predict future behavior, turning operational data into performance insight

The following use cases can benefit from the application of data mining:

  • Manufacturing / Product Development: Model defect and customer-complaint data to provide insight into customer satisfaction and help enterprises build better products
  • Consumer Payments: Understand the payment patterns of consumers to drive market penetration analysis and discount guidelines
  • Consumer Industry: Segment customers to understand the customer base and support targeted advertisements and promotions
  • Consumer Industry: Gauge campaign effectiveness by coupling customer segmentation with predictive marketing models
  • Retail Industry: Achieve supply chain efficiencies by mining supply and demand data
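
As a toy illustration of the segmentation use cases above, here is a minimal one-dimensional k-means sketch in pure Python. The spend figures and the choice of k=2 are illustrative assumptions, not data from any real system:

```python
# Toy customer segmentation: 1-D k-means on annual spend figures.
# The spend values and k=2 are illustrative assumptions.

def kmeans_1d(values, k=2, iters=20):
    """Cluster 1-D values into k groups by iterative mean refinement."""
    # Seed centers by taking evenly spaced sorted values.
    centers = sorted(values)[::max(1, len(values) // k)][:k]
    for _ in range(iters):
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Recompute each center as the mean of its assigned values.
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters

spend = [120, 150, 130, 900, 950, 880]  # annual spend per customer
centers, segments = kmeans_1d(spend, k=2)
print(sorted(round(c) for c in centers))  # two segment centroids
```

Real segmentation models work on many dimensions and far larger volumes, but the principle is the same: let the data reveal the groups rather than defining them up front.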

'In-Database' Data Mining
Data Mining is typically a multi-step process.

  1. Define the Business Issue to Be Addressed, e.g., Customer Attrition, Fraud Detection, Cross Selling.
  2. Identify the Data Model / Define the Data / Source the Data (Data Sources, Data Types, Data Usage, etc.)
  3. Choose the Mining Technique (Discovery Data Mining, Predictive Data Mining, Clustering, Link Analysis, Classification, Value Prediction)
  4. Interpret the Results (Visualization Techniques)
  5. Deploy the Results (e.g., into CRM Systems)
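
The steps above can be sketched as a skeleton pipeline. Every function name and the rule-based "model" below are placeholders standing in for real tooling, not any vendor's API:

```python
# Skeleton of the multi-step mining process above; each stage is a
# placeholder for real tooling (all names are illustrative).

def define_issue():
    # Step 1: the business issue to address.
    return "customer_attrition"

def source_data(issue):
    # Step 2: stand-in for identifying and sourcing the data model.
    return [{"tenure": 2, "complaints": 5}, {"tenure": 9, "complaints": 0}]

def mine(records):
    # Step 3: stand-in for a predictive technique -- a trivial rule
    # plays the role of a trained classification model here.
    return [r["complaints"] > 3 for r in records]

def interpret(scores):
    # Step 4: turn raw scores into something a business user can read.
    return ["at risk" if s else "retained" for s in scores]

def deploy(results):
    # Step 5: in practice this would feed a CRM system.
    return results

results = deploy(interpret(mine(source_data(define_issue()))))
print(results)  # ['at risk', 'retained']
```

The point of the sketch is the shape of the process: each stage consumes the previous one's output, which is why latency at any step delays the whole cycle.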

Data Mining was initially implemented with a combination of multiple tools and systems, which resulted in latency and a long cycle to realize results.

Recognizing this issue, major RDBMS vendors have implemented Data Mining as part of their core database offering. This offering has the following key features:

  • Data Mining engine resides inside the traditional database environment facilitating easier licensing and packaging options
  • Eliminates data extraction and data movement, avoiding a costly ETL process
  • Major Data Mining models are available as pre-built SQL functions which can be easily integrated into the existing database development process.

The following summarizes the data mining features of some popular databases:

Built as DB2 data mining functions, the Modeling and Scoring services directly integrate data mining technology into DB2. This leads to faster application performance. Developers want integration and performance, as well as any facility to make their job easier. The model can be used within any SQL statement. This means the scoring function can be invoked with ease from any application that is SQL aware, either in batch, real time, or as a trigger.
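
The general idea of invoking a scoring function from plain SQL can be mimicked with SQLite's user-defined functions. The `churn_score` rule below is a made-up stand-in for a real mining model, and this is SQLite, not DB2's actual interface:

```python
# Mimic "in-database scoring": register a scoring function so any
# SQL-aware caller can invoke it inline. churn_score is a made-up
# stand-in for a trained model, not DB2's or Oracle's actual API.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE customers (name TEXT, complaints INTEGER)")
con.executemany("INSERT INTO customers VALUES (?, ?)",
                [("acme", 5), ("globex", 0)])

# The "model" lives next to the data and is called from SQL directly,
# so no rows are extracted or moved for scoring.
con.create_function("churn_score", 1,
                    lambda complaints: min(1.0, complaints / 4))

rows = con.execute(
    "SELECT name, churn_score(complaints) FROM customers ORDER BY name"
).fetchall()
print(rows)  # [('acme', 1.0), ('globex', 0.0)]
```

This is the appeal the vendors are selling: the scoring call composes with any SQL statement, so batch jobs, real-time queries and triggers can all reuse the same model.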

Oracle Data Mining, a component of the Oracle Advanced Analytics Option, delivers a wide range of cutting-edge machine learning algorithms inside the Oracle Database. Since Oracle Data Mining functions reside natively in the Oracle Database kernel, they deliver unparalleled performance, scalability and security. The data and data mining functions never leave the database, delivering a comprehensive in-database processing solution.

Data Virtualization: Data Virtualization is a concept that allows enterprises to access information contained in disparate data sources in a seamless way. As mentioned in my earlier articles, vendors such as Composite Software, Denodo Technologies, IBM, Informatica and Microsoft have developed specialized data virtualization engines. My earlier article details Data Virtualization using Middleware vs. RDBMS.

Data virtualization solutions provide a virtualized data services layer that integrates data from heterogeneous data sources and content in real time, near-real time, or batch as needed to support a wide range of applications and processes. The Forrester Wave: Data Virtualization, Q1 2012 puts data virtualization in the following perspective: "In the past 24 months, we have seen a significant increase in adoption in the healthcare, insurance, retail, manufacturing, eCommerce, and media/entertainment sectors. Regardless of industry, all firms can benefit from data virtualization."

Data Mining Inside Data Virtualization Platforms?
The increase in data sources, especially integration with Big Data and unstructured data, has made the Data Virtualization platform an important part of enterprise data access strategy. Data virtualization provides the following attributes for efficient data access across the enterprise:

  • Abstraction: Provides location, API, language and storage technology independent access of data
  • Federation: Converges data from multiple disparate data sources
  • Transformation: Enriches the quality and quantity of data on an as-needed basis
  • On-Demand Delivery: Provides the consuming applications the required information on-demand
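
A minimal sketch of the federation and abstraction attributes above: a virtual layer exposing one query interface over two dissimilar in-memory "sources". All names and records here are hypothetical:

```python
# Minimal data-virtualization sketch: one interface federating two
# dissimilar in-memory "sources". All names/records are hypothetical.

crm_rows = [{"customer": "acme", "region": "east"}]   # e.g. an RDBMS
billing_docs = [{"cust": "acme", "balance": 250.0}]   # e.g. a document store

def virtual_customers():
    """Federate and transform both sources into one canonical view."""
    # Transformation: reconcile the differing key names ("cust" vs "customer").
    balances = {d["cust"]: d["balance"] for d in billing_docs}
    for row in crm_rows:
        # Abstraction: the consumer never sees which source a field came from.
        yield {"customer": row["customer"],
               "region": row["region"],
               "balance": balances.get(row["customer"], 0.0)}

print(list(virtual_customers()))
# [{'customer': 'acme', 'region': 'east', 'balance': 250.0}]
```

A real platform would do this over live connections with query pushdown and caching, but the consuming application's view is the same: one logical schema, delivered on demand.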

With the above benefits in mind, it is evident that enterprises would find Data Virtualization platforms even more useful if they were built with Data Mining models and algorithms, so that effective Data Mining could be performed on top of the Data Virtualization platform.

As an important part of Data Mining is identifying the correct data sources and associated events of interest, effective Data Mining is better achieved by bringing disparate data sources under the scope of a Data Virtualization Platform than by confining Data Mining to a single database engine.

An extended view of the Data Virtualization Platform shows how Data Mining can become part of the platform itself.

Summary
Data Virtualization is becoming part of the mainstream enterprise data access strategy, mainly because it abstracts multiple data sources, avoids complex ETL processing, and facilitates a single version of the truth, improved data quality and a zero-latency enterprise.

If value-adds like a Data Mining engine can be built on top of the existing Data Virtualization platform, enterprises will benefit further.

More Stories By Srinivasan Sundara Rajan

Highly passionate about utilizing Digital Technologies to enable next generation enterprise. Believes in enterprise transformation through the Natives (Cloud Native & Mobile Native).
