Case Study: Open Source + Business Intelligence

A marriage made for data-driven businesses

Data-driven businesses are facing some tough challenges in today's rapidly changing information landscape. As decision cycles continue to shrink, companies need to act on information within hours or minutes rather than weeks or days. At the same time, the volume of data that needs to be analyzed is growing exponentially. Business intelligence (BI) approaches that might have made sense a decade or even five years ago may no longer be the best fit for organizations that must quickly and affordably make sense of terabytes of incoming data, a flow that shows no sign of slowing down.

For my company, MX Force, speedy data analysis is not simply a "nice to have"; it's critical to our business. As a cloud-based provider of email security for organizations of all sizes, we need to identify the origins of spam, viruses and other potential threats for our clients, fast. But as our business has grown, so has the volume of email log data that we must store, filter, search, analyze and report on. Recently, we were challenged to find a database that could reliably enable quick and efficient ad-hoc queries on up to a year's worth of email log data. Our staff uses this data for statistical analysis and reporting, and we also give our clients the ability to query their own logs to diagnose mail delivery issues. It was important to find a database that could deliver the high performance we required, but affordability and ease of administration were also of vital concern. These considerations prompted us to seek an open source solution.

Open Source Meets Business Intelligence
MX Force uses a number of open source tools within our organization. The low cost of open source is one reason for this, but flexibility is another important driver. Because open source projects are community-driven, users can tweak, customize and tinker with the software as much as they like. This is a big advantage when it comes to business intelligence, as data analysis requirements can change quickly, and you don't want to have to wait weeks or months to get a new query set up or to change the parameters of those that are already running.

MX Force was already using MySQL in our business, so we decided to try Infobright's open source analytic database, ICE (Infobright Community Edition). ICE combines a columnar database with innovative compression and self-tuning capabilities that eliminate the need to create indexes, partition data or perform any other manual tuning to achieve fast response for queries and reports. The software is built on MySQL, so for us the implementation effort and learning curve were very small - ICE uses the same familiar MySQL interface. The fact that ICE is an open source analytic solution presented us with several key benefits:

  1. Deployment speed: The time from download and installation to first production use was just three weeks.
  2. Affordability: Many of the proprietary commercial BI solutions available today require custom configuration, expensive licensing agreements and equally expensive hardware to support and run them. Not only was ICE free to install, we could also run the software on inexpensive commodity servers, eliminating the need to invest in high performance servers and storage arrays. (Our entire workload is supported by a single quad-core server.)
  3. Simplicity and flexibility: Because ICE is open and standards-based, we can quickly make changes as needed without requiring extensive IT assistance. In addition, it's often a lot simpler to fix or upgrade an open source solution because an entire community contributes its expertise to fixing bugs and making improvements. With proprietary software, users have to wait for issues to be addressed by the vendor, which can take much longer.
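
As a rough illustration of why the columnar storage and compression mentioned above suit log data, the sketch below stores one column on its own and run-length encodes it. This is only a generic teaching example, not Infobright's actual compression scheme; the column name, the values and the helper functions are all hypothetical.

```python
from itertools import groupby


def rle_encode(column):
    """Collapse consecutive repeats into (value, run_length) pairs."""
    return [(value, len(list(run))) for value, run in groupby(column)]


def count_matches(encoded, target):
    """Count rows equal to target by scanning runs, not individual rows."""
    return sum(length for value, length in encoded if value == target)


# Hypothetical "verdict" column from a mail log, stored on its own
# rather than interleaved with the other fields of each row.
verdicts = ["clean"] * 6 + ["spam"] * 3 + ["clean"]

encoded = rle_encode(verdicts)
print(encoded)                         # ten values stored as three runs
print(count_matches(encoded, "spam"))  # answered from the runs alone
```

Log columns tend to repeat the same few values many times, so a per-column layout gives a compressor long runs to work with, and a count over one column never has to read the others.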

MX Force is currently using ICE to quickly isolate mail flow problems and trends. In our experience, using a free, open source product has not in any way involved a compromise on performance or capabilities. We are achieving 10:1 data compression, which saves on storage costs and boosts performance. Most statistical queries render results in less than five seconds. Ongoing administration is simple. The net result is that the product delivers the fast query performance and reporting functionality we needed, at an incredibly low cost for hardware and ongoing maintenance.
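
To give a feel for the kind of ad-hoc statistical query described here, the following sketch ranks sender domains by spam volume. It makes several assumptions: the `mail_log` table, its columns and the sample rows are invented for illustration, and Python's built-in `sqlite3` module stands in for a MySQL-compatible server so the example runs on its own. It is not MX Force's schema or production code.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database holding a tiny, made-up mail log."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE mail_log (received_at TEXT, sender_domain TEXT, verdict TEXT)"
    )
    conn.executemany(
        "INSERT INTO mail_log VALUES (?, ?, ?)",
        [
            ("2024-01-01 08:00", "bulk.example", "spam"),
            ("2024-01-01 08:01", "bulk.example", "spam"),
            ("2024-01-01 08:02", "shop.example", "clean"),
            ("2024-01-01 08:03", "promo.example", "spam"),
            ("2024-01-01 08:04", "bulk.example", "spam"),
            ("2024-01-01 08:05", "promo.example", "clean"),
        ],
    )
    return conn


def top_spam_sources(conn, limit=3):
    """Return (sender_domain, spam_count) pairs, busiest source first."""
    query = """
        SELECT sender_domain, COUNT(*) AS spam_count
        FROM mail_log
        WHERE verdict = 'spam'
        GROUP BY sender_domain
        ORDER BY spam_count DESC, sender_domain ASC
        LIMIT ?
    """
    return conn.execute(query, (limit,)).fetchall()


conn = build_demo_db()
print(top_spam_sources(conn))
```

The query uses only standard SQL aggregation, so the same statement should carry over to a MySQL-style interface with little or no change, apart from the parameter placeholder syntax of whichever client library is used.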

Look, then Leap
Interested in giving open source a try for your BI and analytic efforts? There are a number of compelling benefits to doing so, but as with any type of software, it's also important to look before you leap. Evaluation and testing considerations are no different from those for licensed software - you want to make sure the solution has the features and capabilities most relevant to your business. There's also a difference between open source projects that are at a very early and experimental stage and software that is well established and has a vibrant and involved community behind it, strong vendor support, or both. Investigate the support offered for the solution under consideration. How often are new features added? Are bug fixes made in a timely manner? Is there useful and accurate supporting documentation?

With ICE, we were certainly attracted by the many resources and significant participation of both Infobright and the user community. We also knew there was a commercial version available if we decided we needed the additional functionality it offered or a formal support contract. For companies just jumping into the open source arena, it's best to avoid tools that haven't yet cultivated a strong following. But even if you do make a mistake, the low (and usually free) cost of open source means that there's minimal risk.

The BI requirements of today's data-driven businesses demand speed, simplicity and affordability. As open source solutions continue to mature, it's worth looking at projects that are focused on analytics, BI and other data management activities. The more nimble and flexible approach embodied by open source may just be the best fit for addressing the many information management challenges driven by data growth and complexity.

More Stories By Mike Makowski

Mike Makowski is CTO of MX Force, a leading provider of email security in the cloud and member of Infobright’s Customer Advisory Council. More information about MX Force can be found at http://www.mxforce.com/
