Welcome!

Microservices Expo Authors: Pat Romanski, Sujoy Sen, Liz McMillan, AppDynamics Blog, VictorOps Blog

Related Topics: @CloudExpo, Java IoT, Industrial IoT, Microservices Expo, Microsoft Cloud, Containers Expo Blog

@CloudExpo: Article

Drilling the Data Wells of the Future

Human-generated Big Data could make us rich

Human-generated content is comprised of all the files and e-mails that we create every day, all the presentations, word processing documents, spreadsheets, audio files and other documents our employers ask us to produce hour-by-hour. These are the files that take up the vast majority of digital storage space in most organizations - they are kept for significant amounts of time and have huge amounts of metadata associated with them. Human-generated content is huge, and its metadata is even bigger. Metadata is the information about a file: who might have created it, what type of file it is, what folder it is stored in, who has been reading it and who has access to it. The content and metadata together make up human-generated Big Data.

The problem is that most of us, meaning organizations and governments, are not yet equipped with the tools to exploit human-generated Big Data. The conclusion of a recent survey of over 1,000 Internet experts and other Internet users, published by the Pew Research Centre and the Imagining the Internet Center at Elon University, is that the world may not be ready to properly handle and understand Big Data.[1] These experts have come to the conclusion that the huge quantities of data, which they term "digital exhaust," which will be created by the year 2020 could very well enhance productivity, improve organizational transparency and expand the frontier of the "knowable future." However, they are concerned about whose hands this information is in and whether government or corporations will use this information wisely.

The survey found that "...human and machine analysis of Big Data could improve social, political and economic intelligence by 2020. The rise of what is known as Big Data will facilitate things like real-time forecasting of events; the development of ‘inferential software' that assesses data patterns to project outcomes; and the creation of algorithms for advanced correlations that enable new understanding of the world."

Of those surveyed, 39% of the Internet experts asked agreed with the counterargument to Big Data's benefits, which posited that "Human and machine analysis of Big Data will cause more problems than it solves by 2020. The existence of huge data sets for analysis will engender false confidence in our predictive powers and will lead many to make significant and hurtful mistakes. Moreover, analysis of Big Data will be misused by powerful people and institutions with selfish agendas who manipulate findings to make the case for what they want."

As one of the study's participants, entrepreneur Bryan Trogdon put it: "Big Data is the new oil," observing that, "...the companies, governments, and organizations that are able to mine this resource will have an enormous advantage over those that don't. With speed, agility, and innovation determining the winners and losers, Big Data allows us to move from a mindset of ‘measure twice, cut once' to one of ‘place small bets fast.'"[2]

Jeff Jarvis, professor and blogger, said: "Media and regulators are demonizing Big Data and its supposed threat to privacy. Such moral panics have occurred often thanks to changes in technology. But the moral of the story remains: there is value to be found in this data, value in our newfound ability to share. Google's founders have urged government regulators not to require them to quickly delete searches because, in their patterns and anomalies, they have found the ability to track the outbreak of the flu before health officials could and they believe that by similarly tracking a pandemic, millions of lives could be saved. Demonizing data, big or small, is demonising knowledge, and that is never wise."[3]

Sean Mead, director of analytics at Mead, Mead & Clark, Interbrand said: "Large, publicly available data sets, easier tools, wider distribution of analytics skills, and early stage artificial intelligence software will lead to a burst of economic activity and increased productivity comparable to that of the Internet and PC revolutions of the mid to late 1990s. Social movements will arise to free up access to large data repositories, to restrict the development and use of AIs, and to ‘liberate' AIs."[4]

These are very interesting arguments and they do begin to get to the heart of the matter - which is that our data sets have grown beyond our ability to analyze and process them without sophisticated automation. We simply have to rely on technology to analyze and cope with this enormous wave of content and metadata.

Analyzing human-generated Big Data has enormous potential. More than potential, harnessing the power of metadata has become essential to manage and protect human-generated content. File shares, emails, and intranets have made it so easy for end users to save and share files that organizations now have more human-generated content than they can sustainably manage and protect using small data thinking. Many organizations face real problems because questions that could be answered 15 years ago on smaller, more static data sets can no longer be answered. These questions include: Where does critical data reside, who accesses it, and who should have access to it? As a consequence, IDC estimates that only half the data that should be protected is protected.

The problem is compounded with cloud-based file sharing, as these services create yet another growing store of human-generated content requiring management and protection - one that lies outside corporate infrastructure with different controls and management processes.

David Weinberger of Harvard University's Berkman Center said: "We are just beginning to understand the range of problems Big Data can solve, even though it means acknowledging that we're less unpredictable, free, madcap creatures than we'd like to think. If harnessing the power of human generated big data can make data protection and management less unpredictable, free, and madcap, organizations will be grateful.

References

  1. http://www.elon.edu/e-web/predictions/expertsurveys/2012survey/future_Big_Data_2020.xhtml
  2. http://pewinternet.org/Reports/2012/Future-of-Big-Data/Overview.aspx
  3. http://pewinternet.org/Reports/2012/Future-of-Big-Data/Overview.aspx
  4. http://pewinternet.org/Reports/2012/Future-of-Big-Data/Overview.aspx

More Stories By Rob Sobers

Rob Sobers is a designer, web developer, and technical marketing manager for Varonis, where he oversees the company's online marketing strategy. He writes a popular blog on software and security at accidentalhacker.com and is co-author of the book "Learn Ruby the Hard Way", which has been used by thousands of students to learn the Ruby programming language. Rob is a 12-year technology industry veteran and, prior to joining Varonis, held positions in software engineering, design, and professional services.

Comments (0)

Share your thoughts on this story.

Add your comment
You must be signed in to add a comment. Sign-in | Register

In accordance with our Comment Policy, we encourage comments that are on topic, relevant and to-the-point. We will remove comments that include profanity, personal attacks, racial slurs, threats of violence, or other inappropriate material that violates our Terms and Conditions, and will block users who make repeated violations. We ask all readers to expect diversity of opinion and to treat one another with dignity and respect.


@MicroservicesExpo Stories
Small teams are more effective. The general agreement is that anything from 5 to 12 is the 'right' small. But of course small teams will also have 'small' throughput - relatively speaking. So if your demand is X and the throughput of a small team is X/10, you probably need 10 teams to meet that demand. But more teams also mean more effort to coordinate and align their efforts in the same direction. So, the challenge is how to harness the power of small teams and yet orchestrate multiples of them...
As enterprises around the world struggle with their digital transformation efforts, many are finding that innovative digital teams are moving much faster than their hidebound IT organizations. Rather than struggling to convince traditional IT to get with the digital program, executives are taking advice from IT research firm Gartner, and encouraging existing IT to continue in their desultory ways. However, many CIOs are realizing the dangers of following Gartner’s advice. The central challenge ...
Much of the value of DevOps comes from a (renewed) focus on measurement, sharing, and continuous feedback loops. In increasingly complex DevOps workflows and environments, and especially in larger, regulated, or more crystallized organizations, these core concepts become even more critical. In his session at @DevOpsSummit at 18th Cloud Expo, Andi Mann, Chief Technology Advocate at Splunk, will show how, by focusing on 'metrics that matter,' you can provide objective, transparent, and meaningfu...
The goal of any tech business worth its salt is to provide the best product or service to its clients in the most efficient and cost-effective way possible. This is just as true in the development of software products as it is in other product design services. Microservices, an app architecture style that leans mostly on independent, self-contained programs, are quickly becoming the new norm, so to speak. With this change comes a declining reliance on older SOAs like COBRA, a push toward more s...
In a crowded world of popular computer languages, platforms and ecosystems, Node.js is one of the hottest. According to w3techs.com, Node.js usage has gone up 241 percent in the last year alone. Retailers have taken notice and are implementing it on many levels. I am going to share the basics of Node.js, and discuss why retailers are using it to reduce page load times and improve server efficiency. I’ll talk about similar developments such as Docker and microservices, and look at several compani...
In the world of DevOps there are ‘known good practices’ – aka ‘patterns’ – and ‘known bad practices’ – aka ‘anti-patterns.' Many of these patterns and anti-patterns have been developed from real world experience, especially by the early adopters of DevOps theory; but many are more feasible in theory than in practice, especially for more recent entrants to the DevOps scene. In this power panel at @DevOpsSummit at 18th Cloud Expo, moderated by DevOps Conference Chair Andi Mann, panelists will dis...
SYS-CON Events announced today that Peak 10, Inc., a national IT infrastructure and cloud services provider, will exhibit at SYS-CON's 18th International Cloud Expo®, which will take place on June 7-9, 2016, at the Javits Center in New York City, NY. Peak 10 provides reliable, tailored data center and network services, cloud and managed services. Its solutions are designed to scale and adapt to customers’ changing business needs, enabling them to lower costs, improve performance and focus inter...
Many private cloud projects were built to deliver self-service access to development and test resources. While those clouds delivered faster access to resources, they lacked visibility, control and security needed for production deployments. In their session at 18th Cloud Expo, Steve Anderson, Product Manager at BMC Software, and Rick Lefort, Principal Technical Marketing Consultant at BMC Software, will discuss how a cloud designed for production operations not only helps accelerate developer...
Last week I had the pleasure of speaking on a panel at Sapphire Ventures Next-Gen Tech Stack Forum in San Francisco. Obviously, I was excited to join the discussion, but as a participant the event crystallized not only where the larger software development market is relative to microservices, container technologies (like Docker), continuous integration and deployment; but also provided insight into where DevOps is heading in the coming years.
Wow, if you ever wanted to learn about Rugged DevOps (some call it DevSecOps), sit down for a spell with Shannon Lietz, Ian Allison and Scott Kennedy from Intuit. We discussed a number of important topics including internal war games, culture hacking, gamification of Rugged DevOps and starting as a small team. There are 100 gold nuggets in this conversation for novices and experts alike.
The notion of customer journeys, of course, are central to the digital marketer’s playbook. Clearly, enterprises should focus their digital efforts on such journeys, as they represent customer interactions over time. But making customer journeys the centerpiece of the enterprise architecture, however, leaves more questions than answers. The challenge arises when EAs consider the context of the customer journey in the overall architecture as well as the architectural elements that make up each...
Admittedly, two years ago I was a bulk contributor to the DevOps noise with conversations rooted in the movement around culture, principles, and goals. And while all of these elements of DevOps environments are important, I’ve found that the biggest challenge now is a lack of understanding as to why DevOps is beneficial. It’s getting the wheels going, or just taking the next step. The best way to start on the road to change is to take a look at the companies that have already made great headway ...
In 2006, Martin Fowler posted his now famous essay on Continuous Integration. Looking back, what seemed revolutionary, radical or just plain crazy is now common, pedestrian and "just what you do." I love it. Back then, building and releasing software was a real pain. Integration was something you did at the end, after code complete, and we didn't know how long it would take. Some people may recall how we, as an industry, spent a massive amount of time integrating code from one team with another...
From the conception of Docker containers to the unfolding microservices revolution we see today, here is a brief history of what I like to call 'containerology'. In 2013, we were solidly in the monolithic application era. I had noticed that a growing amount of effort was going into deploying and configuring applications. As applications had grown in complexity and interdependency over the years, the effort to install and configure them was becoming significant. But the road did not end with a ...
I have an article in the recently released “DZone Guide to Building and Deploying Applications on the Cloud” entitled “Fullstack Engineering in the Age of Hybrid Cloud”. In this article I discuss the need and skills of a Fullstack Engineer with relation to troubleshooting and repairing complex, distributed hybrid cloud applications. My recent experiences with troubleshooting issues with my Docker WordPress container only reinforce the details I wrote about in this piece. Without my comprehensive...
As the software delivery industry continues to evolve and mature, the challenge of managing the growing list of the tools and processes becomes more daunting every day. Today, Application Lifecycle Management (ALM) platforms are proving most valuable by providing the governance, management and coordination for every stage of development, deployment and release. Recently, I spoke with Madison Moore at SD Times about the changing market and where ALM is headed.
Much of the discussion around cloud DevOps focuses on the speed with which companies need to get new code into production. This focus is important – because in an increasingly digital marketplace, new code enables new value propositions. New code is also often essential for maintaining competitive parity with market innovators. But new code doesn’t just have to deliver the functionality the business requires. It also has to behave well because the behavior of code in the cloud affects performan...
Struggling to keep up with increasing application demand? Learn how Platform as a Service (PaaS) can streamline application development processes and make resource management easy.
If there is anything we have learned by now, is that every business paves their own unique path for releasing software- every pipeline, implementation and practices are a bit different, and DevOps comes in all shapes and sizes. Software delivery practices are often comprised of set of several complementing (or even competing) methodologies – such as leveraging Agile, DevOps and even a mix of ITIL, to create the combination that’s most suitable for your organization and that maximize your busines...
Digital means customer preferences and behavior are driving enterprise technology decisions to be sure, but let’s not forget our employees. After all, when we say customer, we mean customer writ large, including partners, supply chain participants, and yes, those salaried denizens whose daily labor forms the cornerstone of the enterprise. While your customers bask in the warm rays of your digital efforts, are your employees toiling away in the dark recesses of your enterprise, pecking data into...