By Dana Gardner
February 28, 2013 08:00 AM EST
We recently assembled a panel of experts to explore how big data changes the status quo for architecting the enterprise. The bottom line from the discussion is that large enterprises should not just wade into big data as an isolated function, but should anticipate the strategic effects and impacts of big data -- as well as the simultaneous complicating factors of cloud computing and mobile -- as soon as possible.
The panel consisted of Robert Weisman, CEO and Chief Enterprise Architect at Build The Vision; Andras Szakal, Vice President and CTO of IBM's Federal Division; Jim Hietala, Vice President for Security at The Open Group, and Chris Gerty, Deputy Program Manager at the Open Innovation Program at NASA. I served as the moderator.
And this special BriefingsDirect thought leadership interview series comes to you in conjunction with The Open Group Conference recently held in Newport Beach, California. The conference focused on "big data -- the transformation we need to embrace today." [Disclosure: The Open Group is a sponsor of this and other BriefingsDirect podcasts.]
An interesting thread for me throughout the conference was to factor where big data begins and plain old data, if you will, ends. Of course, it's going to vary quite a bit from organization to organization.
But Gerty from NASA, part of our panel, provided a good example: It’s when you run out of gas with your old data methods, and your ability to deal with the data -- and it's not just the size of the data itself.
Therefore, big data means doing things differently -- not just to manage the velocity and the volume and the variety of the data, but to really think about data fundamentally and differently. And, we need to think about security, risk and governance. If it's a "boundaryless organization" when it comes to your data, either as a product or service or a resource, then the control and management of which data should be exposed, which should be opened, and which should be very closely guarded all need to be factored, determined and implemented.
Here are some excerpts from the on-stage discussion:
Dana Gardner: You mentioned that big data to you is not a factor of the size, because NASA's dealing with so much. It’s when you run out of steam, as it were, with the methodologies. Maybe you could explain more. When do you know that you've actually run out of steam with the methodologies?
Gerty: When we collect data, we have some sort of goal in mind of what we might get out of it. When we put the pieces from the data together, it either maybe doesn't fit as well as you thought or you are successful and you continue to do the same thing, gathering archives of information.
At that point, where you realize there might even be something else that you want to do with the data, different than what you planned originally, that's when we have to pivot a little bit and say, "Now I need to treat this as a living archive. It's an 'it may live beyond me' type of thing." At that point, I think you treat it as setting up the infrastructure for being used later, whether it be by you or someone else. That's an important transition to make and might be what one could define as big data.
Gardner: Andras, does that square with where you are in your government interactions -- that data now becomes a different type of resource, and that you need to know when to do things differently?
Szakal: The importance of data hasn’t changed. The data itself, the veracity of the data, is still important. Transactional data will always need to exist. The difference is that you have certainly the three or four Vs, depending on how you look at it, but the importance of data is in its veracity, and your ability to understand or to be able to use that data before the data's shelf life runs out.
Some data has a shelf life that's long lived. Other data has very little shelf life, and you would use different approaches to being able to utilize that information. It's ultimately not about the data itself, but it’s about gaining deep insight into that data. So it’s not storing data or manipulating data, but applying those analytical capabilities to data.
Gardner: Bob, we've seen the price points on storage go down so dramatically. We've seen people just decide to hold on to data that they wouldn't have before, simply because they can and they can afford to do so. That means we need to try to extract value and use that data. From the perspective of an enterprise architect, how are things different now, vis-à-vis this much larger set of data and variety of data, when it comes to planning and executing as architects?
Weisman: One of the major issues is that normally organizations are holding two orders of magnitude more data than they need. It's a huge overhead, both in terms of the applications architecture, which has a code base larger than it should be, and also from the technology architecture that is supporting a horrendous number of servers and a whole bunch of technology stuff that they don't need.
The issue for the architect is to figure out what data is useful, institute a governance process, so that you can have data lifecycle management, have a proper disposition, focus the organization on information, data, and knowledge that is basically going to provide business value to the organization, and help them innovate and have a competitive advantage.
Can't afford it
And in terms of government, just improve service delivery, because there's waste right now on information infrastructure, and we can’t afford it anymore.
Gardner: So it's difficult to know what to keep and what not to keep. I've actually spoken to a few people lately who want to keep everything, just because they want to mine it, and they are willing to spend the money and effort to do that.
Jim Hietala, when people do get to this point of trying to decide what to keep, what not to keep, and how to architect properly for that, they also need to factor in security. It shouldn't come later in the process. It should come early. What are some of the precepts that you think are important in applying good security practices to big data?
Hietala: One of the big challenges is that many of the big-data platforms weren’t built from the get-go with security in mind. So some of the controls that you've had available in your relational databases, for instance, you move over to the big data platforms and the access control authorizations and mechanisms are not there today.
Planning the architecture, looking at bringing in third-party controls to give you the security mechanisms that you are used to in your older platforms, is something that organizations are going to have to do. It’s really an evolving and emerging thing at this point.
Gardner: There are a lot of unknown unknowns out there, as we discovered with our tweet chat last month. Some people think that the data is just data, and you apply the same security to it. Do you think that’s the case with big data? Is it just another follow-through of what you always did with data in the first place?
Hietala: I would say yes, at a conceptual level, but it's like what we saw with virtualization. When there was a mad rush to virtualize everything, many of those traditional security controls didn't translate directly into the virtualized world. The same thing is true with big data.
When you're talking about those volumes of data, applying encryption, applying various security controls, you have to think about how those things are going to scale. That may require new solutions from new technologies and that sort of thing.
Gardner: Chris Gerty, when it comes to that governance, security, and access control, are there any lessons that you've learned that you are aware of in terms of the best of openness, but also with the ability to manage the spigot?
Gerty: Spigot is probably a dangerous term to use, because it implies that all data is treated the same. The sooner that you can tag the data as either sensitive or not, mostly coming from the person or team that's developed or originated the data, the better.
Kicking the can
Once you have it on a hard drive, once you get crazy about storing everything, if you don't know where it came from, you're forced to put it into a secure environment. And that's just kicking the can down the road. It’s really a disservice to people who might use the data in a useful way to address their problems.
We constantly have satellites that are made for one purpose. They send all the data down. It’s controlled either for security or for intellectual property (IP), so someone can write a paper. Then, after the project doesn’t get funded or it just comes to a nice graceful close, there is that extra step, which is almost a responsibility of the originators, to make it useful to the rest of the world.
Gardner: Let’s look at big data through the lens of some other major trends right now. Let’s start with cloud. You mentioned that at NASA, you have your own private cloud that you're using a lot, of course, but you're also now dabbling in commercial and public clouds. Frankly, the price points that these cloud providers are offering for storage and data services are pretty compelling.
So we should expect more data to go to the cloud. Bob, from your perspective, as organizations and architects have to think about data in this hybrid cloud on-premises off-premises, moving back and forth, what do you think enterprise architects need to start thinking about in terms of managing that, planning for the right destination of data, based on the right mix of other requirements?
Weisman: It's a good question. As you said, the price point is compelling, but the security and privacy of the information is something else that has to be taken into account. Where is that information going to reside? You have to have very stringent service-level agreements (SLAs) and in certain cases, you might say it's a price point that’s compelling, but the risk analysis that I have done means that I'm going to have to set up my own private cloud.
Right now, everybody's saying the public cloud is going to be the way to go. Vendors are going to have to be very sensitive to that and many are, at this point in time, addressing a lot of the needs of some of the large client bases. So it's not one-size-fits-all and it's more than just a price for service. Architecture can bring down the price pretty dramatically, even within an enterprise.
Gardner: Andras, how do the cloud and big data come together in a way that’s intriguing to you?
Szakal: Actually it’s a great question. We could take the rest of the 22 minutes talking on this one question. I helped lead the President’s Commission on big data that Steve Mills from IBM and -- I forget the name of the executive from SAP -- led. We intentionally tried to separate cloud from big data architecture, primarily because we don't believe that, in all cases, cloud is the answer to all things big data. You have to define the architecture that's appropriate for your business needs.
However, it also depends on where the data is born. Take many of the investments IBM has made into enterprise market management, for example, Coremetrics, several of these services that we now offer for helping customers understand deep insight into how their retail market or supply chain behaves.
Born in the cloud
All of that information is born in the cloud. But if you're talking about actually using cloud as infrastructure and moving around huge sums of data or constructing some of these solutions on your own, then some of the ideas that Bob conveyed are absolutely applicable.
I think it becomes prohibitive to do that and easier to stand up a hybrid environment for managing the amount of data. But I think that you have to think about whether your data is real-time data, whether it's data that you could apply some of these new technologies like Hadoop to, Hadoop MapReduce-type solutions, or whether it's traditional data warehousing.
Data warehouses are going to continue to exist and they're going to continue to evolve technologically. You're always going to use a subset of data in those data warehouses, and it's going to be an applicable technology for many years to come.
Gardner: So suffice it to say, an enterprise architect who is well versed in both cloud infrastructure requirements, technologies, and methods, as well as big data, will probably be in quite high demand. That specialization in one or the other isn’t as valuable as being able to cross-pollinate between them.
Szakal: Absolutely. It's enabling our architects and finding deep individuals who have this unique set of skills, analytics, mathematics, and business. Those individuals are going to be the future architects of the IT world, because analytics and big data are going to be integrated into everything that we do and become part of the business processing.
Gardner: Well, that’s a great segue to the next topic that I am interested in, and it's around mobility as a trend and also application development. The reason I lump them together is that I increasingly see developers being tasked with mobile first.
When you create a new app, you have to remember that this is going to run in the mobile tier and you want to make sure that the requirements, the UI, and the complexity of that app don’t go beyond the ability of the mobile app and the mobile user. This is interesting to me, because data now has a different relationship with apps.
We used to think of apps as creating data and then the data would be stored and it might be used or integrated. Now, we have applications that are simply there in order to present the data and we have the ability now to present it to those mobile devices in the mobile tier, which means it goes anywhere, everywhere all the time.
Let me start with you, Jim, because it's security and risk, but it's also just rethinking the way we use data in a mobile tier. If we can do it safely, and that's a big IF, how important should it be for organizations to start thinking about making this data available to all of these devices and pouring it out into that mobile tier as much as possible?
Hietala: In terms of enabling the business, it’s very important. There are a lot of benefits that accrue from accessing your data from whatever device you happen to be on. To me, it is that question of "if," because now there’s a whole lot of problems to be solved relative to the data floating around anywhere on Android, iOS, whatever the platform is, and the organization being able to lock down their data on those devices, forgetting about whether it’s the organization device or my device. There’s a set of issues around that that the security industry is just starting to get their arms around today.
Gardner: Chris, any thoughts about this mobile ability that the data gets more valuable the more you can use it and apply it, and then the more you can apply it, the more data you generate that makes the data more valuable, and we start getting into that positive feedback loop?
Gerty: Absolutely. It's almost an appreciation of what more people could do and get to the problem. We're getting to the point where, if it's available on your desktop, you’re going to find a way to make it available on your device.
Those same security questions probably need to be answered anyway, but making it mobile compatible is almost an acknowledgment that there will be someone who wants to use it. So let me go that extra step to make it compatible and see what I get from them. It's more of a cultural benefit that you get from making things compatible with mobile.
Gardner: Any thoughts about what developers should be thinking by trying to bring the fruits of big data through these analytics to more users rather than just the BI folks or those that are good at SQL queries? Does this change the game by actually making an application on a mobile device, simple, powerful but accessing this real time updated treasure trove of data?
Gerty: I always think of the astronaut on the moon. He's got a big, bulky glove and he might have a heads-up display in front of him, but he really needs to know exactly a certain piece of information at the right moment, dealing with bandwidth issues, dealing with the environment, foggy helmet, whatever.
It's very analogous to what the day-to-day professional will use trying to find out that quick e-mail he needs to know or which meeting to go to -- which one is more important -- and it all comes down to putting your developer in the shoes of the user. So anytime you can get interaction between the two, that’s valuable.
Weisman: From an enterprise architecture point of view my background is mainly defense and government, but defense mobile computing has been around for decades. So you've always been dealing with that.
The main thing is that in many cases, if they're coming up with information, the whole presentation layer is turning into another architecture domain with information visualization and also with your security controls, with an integrated identity management capability.
It's like you were saying about the astronaut getting it right. He doesn't need to know everything that's happening in the world. He needs to know about his heads-up display, the stuff that's relevant to him.
So it's getting the right information to the person in an authorized manner, in a way that he can visualize and make sense of that information, be it straight data, analytics, or whatever. The presentation layer, ergonomics, visual communication are going to become very important in the future for that. There are also a lot of problems. Rather than doing it at the application level, you're doing it entirely in one layer.
Governance and security
Gardner: So clearly the implications of data are cutting across how we think about security, how we think about UI, how we factor in mobility. What we now think about in terms of governance and security, we have to do differently than we did with older data models.
Jim Hietala, what about the impact on spurring people towards more virtualized desktop delivery, if you don't want to have the data on that end device, if you want to solve some of the issues about control and governance, and if you want to be able to manage just how much data gets into that UI, not too much, not too little?
Do you think that some of these concerns that we’re addressing will push people to look even harder, maybe more aggressive in how they go to desktop and application virtualization, as they say, keep it on the server, deliver out just the deltas?
Hietala: That's an interesting point. I've run across a startup in the last month or two that is doing just that. The whole value proposition is to virtualize the environment. You get virtual gold images. You don't have to worry about what's actually happening on the physical device and you know when the devices connect. The security threat goes away. So we may see more of that as a solution to that.
Gardner: Andras, do you see that some of the implications of big data, far-fetched as it may be, are propelling people to cultivate their servers more and virtualize their apps, their data, and their desktop right up to the end devices?
Szakal: Yeah, I do. I see IBM providing solutions for virtual desktop, but I think it was really a security question you were asking. You're certainly going to see an additional number of virtualized desktop environments.
Ultimately, our network still is not stable enough or at a high enough bandwidth to really make that a useful exercise for all but the most menial users in the enterprise. From a security point of view, there is a lot still to be solved.
And part of the challenge in the cloud environment that we see today is the proliferation of virtual machines (VMs) and the inability to actually contain the security controls within those machines and across these machines from an enterprise perspective. So we're going to see more solutions proliferate in this area and to try to solve some of the management issues, as well as the security issues, but we're a long ways away from that.
Gerty: Big data, by itself, isn't magical. It doesn't have the answers just by being big. If you need more, you need to pry deeper into it. That’s the example. They realized early enough that they were able to make something good.
Gardner: Jim Hietala, any thoughts about examples that illustrate where we’re going and why this is so important?
Hietala: Being a security guy, I tend to talk about scare stories, horror stories. One example from last year struck me. One of the major retailers here in the U.S. hit the news for having predicted, through customer purchase behavior, when people were pregnant.
They could look and see, based upon buying 20 things, that if you're buying 15 of these and your purchase behavior has changed, they can tell that. The privacy implications to that are somewhat concerning.
An example was that this retailer was sending out coupons related to somebody being pregnant. The teenage girl, who was pregnant, hadn't told her family yet. The father found it. There was alarm in the household and at the local retailer store, when the father went and confronted them.
There are privacy implications from the use of big data. When you get powerful new technology in marketing people's hands, things sometimes go awry. So I'd throw that out just as a cautionary tale that there is that aspect to this. When you can see across people's buying transactions, things like that, there are privacy considerations that we’ll have to think about, and that we really need to think about as an industry and a society.
Watch the entire video here.
You may also be interested in:
- Using the Cloud for Big-Data Requires a New Recipe
- Big Data Success Depends on Better Risk Management Practices Like FAIR, Say The Open Group Panelists
- The Open Group Keynoter Sees Big-Data Analytics Bolstering Quality, Manufacturing, Processes
- The Open Group Trusted Technology Forum is Leading the Way to Securing Global IT Supply Chains
- Corporate Data, Supply Chains Remain Vulnerable to Cyber Crime Attacks Says Open Group Conference Speaker
- Open Group Conference Speakers Discuss the Cloud: Higher Risk or Better Security?
- Capgemini's CTO on Why Cloud Computing Exposes the Duality Between IT and Business