Big Data Journal: Article

From Metrics to Success

Ford scours for more big data to bolster quality, improve manufacturing, streamline processes

Ford has exploited the strengths of big data analytics by directing them internally to improve business results. In doing so, they scour the metrics from the company’s best processes across myriad manufacturing efforts and through detailed outputs from in-use automobiles -- all to improve and help transform their business.

So explains Michael Cavaretta, PhD, Technical Leader of Predictive Analytics for Ford Research and Advanced Engineering in Dearborn, Michigan. Cavaretta is one of a group of experts assembled this week for The Open Group Conference in Newport Beach, California.

Cavaretta has led multiple data-analytic projects at Ford to break down silos inside the company and identify Ford’s most fruitful data sets. Ford has successfully aggregated customer feedback and extracted all the internal data to predict how new features and technologies will best improve its cars.

As a contributor to The Open Group conference and its focus on "Big Data -- The Transformation We Need to Embrace Today," Cavaretta explains how big data is fostering business transformation by allowing deeper insights into more types of data efficiently, and thereby improving processes, quality control, and customer satisfaction.

The interview was moderated by Dana Gardner, Principal Analyst at Interarbor Solutions. [Disclosure: The Open Group is a sponsor of BriefingsDirect podcasts.]

Here are some excerpts:

Gardner: What's different now in being able to get at this data and do this type of analysis from five years ago?

Cavaretta: The biggest difference has to do with the cheap availability of storage and processing power, where a few years ago people were very much concentrated on filtering down the datasets that were being stored for long-term analysis. There has been a big sea change with the idea that we should just store as much as we can and take advantage of that storage to improve business processes.

Gardner: How did we get here? What's the process behind the benefits?

Sea change in attitude

Cavaretta: The process behind the benefits has to do with a sea change in the attitude of organizations, particularly IT within large enterprises. There's this idea that you don't need to spend so much time figuring out what data you want to store and worrying about the cost associated with it, and that you should think more about data as an asset. There is value in being able to store it, and being able to go back and extract different insights from it. This comes from really cheap storage, access to parallel processing machines, and great software.

I like to talk to people about the possibility that big data provides and I always tell them that I have yet to have a circumstance where somebody is giving me too much data. You can pull in all this information and then answer a variety of questions, because you don't have to worry that something has been thrown out. You have everything.

You may have 100 questions, and each one of the questions uses a very small portion of the data. Those questions may use different portions of the data, a very small piece, but they're all different. If you go in thinking, "We’re going to answer the top 20 questions and we’re just going to hold data for that," that leaves so much on the table, and you don't get any value out of it.

We're big believers in mash-ups, and we really believe that there is a lot of value in being able to take even datasets that are not specifically big-data sizes yet, and then not go deep, not get more detailed information, but expand the breadth. So it's being able to augment it with other internal datasets, bridging across different business areas as well as augmenting it with external datasets.

A lot of times you can take something that is maybe a few hundred thousand records or a few million records, and then by the time you’re joining it, and appending different pieces of information onto it, you can get the big dataset sizes.
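
As a rough illustration of that breadth-first mash-up idea, the Python sketch below joins a small base dataset to two other sources on a shared key. All dataset names, fields, and values are hypothetical and invented for the example; they are not Ford data, and this is not presented as Ford's method.

```python
# Hypothetical example: widen a base dataset by joining other sources on a shared key.
# Every name and value below is made up for illustration.

def augment(base, *sources, key):
    """Left-join each source onto the base records using `key`."""
    lookups = [{row[key]: row for row in src} for src in sources]
    result = []
    for record in base:
        merged = dict(record)
        for lookup in lookups:
            extra = lookup.get(record[key], {})
            # Add only new fields so the base record's values are never overwritten.
            for field, value in extra.items():
                merged.setdefault(field, value)
        result.append(merged)
    return result

warranty_claims = [
    {"vehicle_id": "V1", "claim": "wiper motor"},
    {"vehicle_id": "V2", "claim": "door seal"},
]
plant_records = [
    {"vehicle_id": "V1", "plant": "Plant A", "build_week": 14},
    {"vehicle_id": "V2", "plant": "Plant B", "build_week": 15},
]
survey_scores = [
    {"vehicle_id": "V1", "satisfaction": 7},
]

wide = augment(warranty_claims, plant_records, survey_scores, key="vehicle_id")
for row in wide:
    print(row)
```

The row count stays the same while each record gains columns from the other sources, which is the "expand the breadth" effect described above. Records with no match in a source (here, V2 has no survey score) simply keep the fields they already have.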

Gardner: You’re really looking primarily at internal data, while also availing yourself of what external data might be appropriate. Maybe you could describe a little bit about your organization, what you do, and why this internal focus is so important for you.

Internal consultants

Cavaretta: I'm part of a larger department that is housed over in the research and advanced-engineering area at Ford Motor Company, and we’re about 30 people. We work as internal consultants, kind of like Capgemini or Ernst & Young, but only within Ford Motor Company. We’re responsible for going out and looking for different opportunities from the business perspective to bring advanced technologies. So, we’ve been focused on the area of statistical modeling and machine learning for I’d say about 15 years or so.

And in this time, we’ve had a number of engagements where we’ve talked with different business customers, and people have said, "We'd really like to do this." Then, we'd look at the datasets that they have, and say, "Wouldn’t it be great if we had had this? So now we have to wait six months or a year."

These new technologies are really changing the game from that perspective. We can turn on the complete fire-hose, and then say that we don't have to worry about that anymore. Everything is coming in. We can record it all. We don't have to worry about if the data doesn’t support this analysis, because it's all there. That's really a big benefit of big-data technologies.

The real value proposition definitely is changing as things are being pushed down in the company to lower-level analysts who are really interested in looking at things from a data-driven perspective. From when I first came in to now, the biggest change has been when Alan Mulally came into the company, and really pushed the idea of data-driven decisions.

Before, we were getting a lot of interest from people who were very focused on the data that they had internally. After that, they had a lot of questions from their management and from upper-level directors and vice presidents saying, "We’ve got all these data assets. We should be getting more out of them." This strategic perspective has really changed a lot of what we’ve done in the last few years.

Gardner: Are we getting to the point where this sort of Holy Grail notion of a total feedback loop across the lifecycle of a major product like an automobile is really within our grasp? Are we getting there, or is this still kind of theoretical? Can we pull it all together and make it a science?

Cavaretta: The theory is there. The question has more to do with the actual implementation and the practicality of it. We're still talking about a lot of data. Even with new advanced technologies and techniques, that’s a lot of data to store, a lot of data to analyze, and a lot of data to make sure that we can mash up appropriately.

And while I think the potential is there and the theory is there, there is also work in being able to get the data from multiple sources. So everything which you can get back from the vehicle, fantastic. Now if you marry that up with internal data, is it survey data, is it manufacturing data, is it quality data? What are the things you want to go after first? We can’t do everything all at the same time.

Highest value

Our perspective has been let’s make sure that we identify the highest value, the greatest ROI areas, and then begin to take some of the major datasets that we have and then push them and get more detail. Mash them up appropriately and really prove up the value for the technologists.

Gardner: Clearly, there's a lot more to come in terms of where we can take this, but I suppose it's useful to have a historic perspective and context as well. I was thinking about some of the early quality gurus like Deming and some of the movement towards quality like Six Sigma. Does this fall within that same lineage? Are we talking about a continuum here over that last 50 or 60 years, or is this something different?

Cavaretta: That’s a really interesting question. From the perspective of analyzing data, using data appropriately, I think there is a really good long history, and Ford has been a big follower of Deming and Six Sigma for a number of years now.

The difference, though, is this idea that you don't have to worry so much upfront about getting the data. If you're doing this right, you have the data right there, and this has some great advantages. You don't have to wait until you get enough history to look for some of these patterns. Then again, it also has some disadvantages, which is that you’ve got so much data that it’s easy to find things that could be spurious correlations or models that don’t make any sense.
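
The spurious-correlation risk is easy to reproduce in a small simulation. The Python sketch below is only an illustration under stated assumptions: the sample size, the number of columns, and the random seed are arbitrary choices, and every column is pure noise with no real relationship to the target.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

rng = random.Random(0)   # fixed seed so the run is repeatable
N_SAMPLES = 30           # arbitrary: a short history of observations
N_FEATURES = 1000        # arbitrary: many unrelated measurements

# The target and every feature are independent random noise.
target = [rng.gauss(0, 1) for _ in range(N_SAMPLES)]
features = [[rng.gauss(0, 1) for _ in range(N_SAMPLES)] for _ in range(N_FEATURES)]

def strongest(k):
    """Largest absolute correlation with the target among the first k noise columns."""
    return max(abs(pearson(f, target)) for f in features[:k])

for k in (1, 10, 100, 1000):
    print(f"{k:>5} random columns -> strongest |r| = {strongest(k):.2f}")
```

Because each larger set contains the smaller ones, the strongest correlation can only stay the same or grow as more columns are searched, even though none of them means anything. That is one reason the domain-knowledge check described next matters.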

The piece that is required is good domain knowledge, in particular when you are talking about making changes in the manufacturing plant. It's very appropriate to look at things and be able to talk with people who have 20 years of experience to say, "This is what we found in the data. Does this match what your intuition is?" Then, take that extra step.

Gardner: How is the notion of the Internet of Things being brought to bear on your gathering of big data and applying it to the analytics in your organization?

Cavaretta: It is a huge area, and not only from the internal process perspective -- RFID tags within the manufacturing plants, as well as out on the plant floor -- but also from all of the information that’s being generated by the vehicle itself.

The Ford Energi generates about 25 gigabytes of data per hour. So you can imagine selling a couple of million vehicles in the near future with that amount of data being generated. There are huge opportunities within that, and there are also some interesting opportunities having to do with opening up some of these systems for third-party developers. OpenXC is an initiative that we have going on at Research and Advanced Engineering.
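
To put that figure in perspective, here is a back-of-the-envelope Python estimate. Only the 25 gigabytes per hour comes from the interview; the fleet size of exactly two million and the one hour of driving per vehicle per day are assumptions made for the sake of the arithmetic.

```python
# Back-of-the-envelope estimate of fleet-wide data generation.
GB_PER_HOUR = 25             # per-vehicle figure quoted in the interview
FLEET_SIZE = 2_000_000       # assumption: "a couple of million vehicles" read as 2 million
HOURS_DRIVEN_PER_DAY = 1.0   # hypothetical average; not from the interview

def fleet_gb_per_day(gb_per_hour, fleet_size, hours_per_day):
    """Total gigabytes generated per day across the whole fleet."""
    return gb_per_hour * fleet_size * hours_per_day

daily_gb = fleet_gb_per_day(GB_PER_HOUR, FLEET_SIZE, HOURS_DRIVEN_PER_DAY)
daily_pb = daily_gb / 1_000_000  # decimal units: 1 PB = 1,000,000 GB
print(f"{daily_gb:,.0f} GB per day, or about {daily_pb:,.0f} PB per day")
```

Under these assumptions the fleet would generate roughly 50 petabytes per day on board, which is why the question of how much of it to send back for analysis becomes important.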

Huge number of sensors

We have a lot of data coming from the vehicle. There’s a huge number of sensors and processors that are being added to the vehicles. There's data being generated there, as well as communication between the vehicle and your cell phone and communication between vehicles.

There's a group over in Ann Arbor, Michigan, the University of Michigan Transportation Research Institute (UMTRI), that’s investigating that, as well as communication between the vehicle and, let’s say, a home system. It lets the home know that you're on your way and it’s time to increase the temperature, if it’s winter outside, or cool it in the summertime.

The amount of data that’s been generated there is invaluable information and could be used for a lot of benefits, both from the corporate perspective, as well as just the very nature of the environment.

Gardner: Just to put a stake in the ground on this, how much data do cars typically generate? Do you have a sense of what now is the case, an average?

Cavaretta: The Energi, according to the latest information that I have, generates about 25 gigabytes per hour. Different vehicles are going to generate different amounts, depending on the number of sensors and processors on the vehicle. But the biggest key has to do with not necessarily where we are right now but where we will be in the near future.

With the amount of information that's being generated from the vehicles, a lot of it is just internal stuff. The question is how much information should be sent back for analysis and to find different patterns? That becomes really interesting as you look at external sensors, temperature, humidity. You can know when the windshield wipers go on, and then to be able to take that information, and mash that up with other external data sources too. It's a very interesting domain.

Gardner: What skills do you target for your group, and what ways do you think that you can improve on that?

Cavaretta: The skills that we have in our department, in particular on our team, are in the area of computer science, statistics, and some good old-fashioned engineering domain knowledge. We’ve really gone about this from a training perspective. Aside from a few key hires, it's really been an internally developed group.

Targeted training

The biggest advantage that we have is that we can go out and be very targeted with the amount of training that we have. There are such great tools out there, especially in the open-source realm, that we can spin things up with relatively low cost and low risk, and do a number of experiments in the area. That's really the way that we push the technologies forward.

Talking with The Open Group really gives me an opportunity to be able to bring people on board with the idea that you should be looking at a difference in mindset. It's not "Here’s a way that data is being generated, look, try and conceive of some questions that we can use, and we’ll store that too." Let's just take everything, we’ll worry about it later, and then we’ll find the value.

It's important to be thinking about data as an asset, rather than as a cost. You even have to spend some money, and it may be a little bit unsafe without really solid ROI at the beginning. Then, move towards pulling that information in, and being able to store it in a way that allows not just the high-level data scientist to get access to and provide value, but people who are interested in the data overall. Those are very important pieces.

The last one is how do you take a big-data project, how do you take something where you’re not storing in the traditional business intelligence (BI) framework that an enterprise can develop, and then connect that to the BI systems and look at providing value to those mash-ups. Those are really important areas that still need some work.

There are many companies, especially large enterprises, that are looking at their data assets and wondering what they can do to monetize this, not only to just pay for the efficiency improvement but as a new revenue stream.

Gardner: For those organizations that want to get started on this, how do you get started?

Cavaretta: We're definitely huge believers in pilot projects and proof of concept, and we like to develop roadmaps by doing. So get out there. Understand that it's going to be messy. Understand that it may be a little bit more costly and the ROI isn't going to be there at the beginning.

But get your feet wet. Start doing some experiments, and then, as those experiments turn from just experimentation into really providing real business value, that’s the time to start looking at a more formal aspect and more formal IT processes. But you've just got to get going at this point.

More Stories By Dana Gardner

At Interarbor Solutions, we create the analysis and in-depth podcasts on enterprise software and cloud trends that help fuel the social media revolution. As a veteran IT analyst, Dana Gardner moderates discussions and interviews that get to the meat of the hottest technology topics. We define and forecast the business productivity effects of enterprise infrastructure, SOA and cloud advances. Our social media vehicles become conversational platforms, powerfully distributed via the BriefingsDirect Network of online media partners like ZDNet and IT-Director.com. As founder and principal analyst at Interarbor Solutions, Dana Gardner created BriefingsDirect to give online readers and listeners in-depth and direct access to the brightest thought leaders on IT. Our twice-monthly BriefingsDirect Analyst Insights Edition podcasts examine the latest IT news with a panel of analysts and guests. Our sponsored discussions provide a unique, deep-dive focus on specific industry problems and the latest solutions. This podcast equivalent of an analyst briefing session -- made available as a podcast/transcript/blog to any interested viewer and search engine seeker -- breaks the mold on closed knowledge. These informational podcasts jump-start conversational evangelism, drive traffic to lead generation campaigns, and produce strong SEO returns. Interarbor Solutions provides fresh and creative thinking on IT, SOA, cloud and social media strategies based on the power of thoughtful content, made freely and easily available to proactive seekers of insights and information. As a result, marketers and branding professionals can communicate inexpensively with self-qualifying readers/listeners in discrete market segments. BriefingsDirect podcasts hosted by Dana Gardner: Full turnkey planning, moderating, producing, hosting, and distribution via blogs and IT media partners of essential IT knowledge and understanding.