| By Rob Sobers | Article Rating: |
|
| October 29, 2012 10:00 AM EDT | Reads: |
3,239 |
Human-generated content is comprised of all the files and e-mails that we create every day, all the presentations, word processing documents, spreadsheets, audio files and other documents our employers ask us to produce hour-by-hour. These are the files that take up the vast majority of digital storage space in most organizations - they are kept for significant amounts of time and have huge amounts of metadata associated with them. Human-generated content is huge, and its metadata is even bigger. Metadata is the information about a file: who might have created it, what type of file it is, what folder it is stored in, who has been reading it and who has access to it. The content and metadata together make up human-generated Big Data.
The problem is that most of us, meaning organizations and governments, are not yet equipped with the tools to exploit human-generated Big Data. The conclusion of a recent survey of over 1,000 Internet experts and other Internet users, published by the Pew Research Centre and the Imagining the Internet Center at Elon University, is that the world may not be ready to properly handle and understand Big Data.[1] These experts have come to the conclusion that the huge quantities of data, which they term "digital exhaust," which will be created by the year 2020 could very well enhance productivity, improve organizational transparency and expand the frontier of the "knowable future." However, they are concerned about whose hands this information is in and whether government or corporations will use this information wisely.

The survey found that "...human and machine analysis of Big Data could improve social, political and economic intelligence by 2020. The rise of what is known as Big Data will facilitate things like real-time forecasting of events; the development of ‘inferential software' that assesses data patterns to project outcomes; and the creation of algorithms for advanced correlations that enable new understanding of the world."
Of those surveyed, 39% of the Internet experts asked agreed with the counterargument to Big Data's benefits, which posited that "Human and machine analysis of Big Data will cause more problems than it solves by 2020. The existence of huge data sets for analysis will engender false confidence in our predictive powers and will lead many to make significant and hurtful mistakes. Moreover, analysis of Big Data will be misused by powerful people and institutions with selfish agendas who manipulate findings to make the case for what they want."
As one of the study's participants, entrepreneur Bryan Trogdon put it: "Big Data is the new oil," observing that, "...the companies, governments, and organizations that are able to mine this resource will have an enormous advantage over those that don't. With speed, agility, and innovation determining the winners and losers, Big Data allows us to move from a mindset of ‘measure twice, cut once' to one of ‘place small bets fast.'"[2]
Jeff Jarvis, professor and blogger, said: "Media and regulators are demonizing Big Data and its supposed threat to privacy. Such moral panics have occurred often thanks to changes in technology. But the moral of the story remains: there is value to be found in this data, value in our newfound ability to share. Google's founders have urged government regulators not to require them to quickly delete searches because, in their patterns and anomalies, they have found the ability to track the outbreak of the flu before health officials could and they believe that by similarly tracking a pandemic, millions of lives could be saved. Demonizing data, big or small, is demonising knowledge, and that is never wise."[3]
Sean Mead, director of analytics at Mead, Mead & Clark, Interbrand said: "Large, publicly available data sets, easier tools, wider distribution of analytics skills, and early stage artificial intelligence software will lead to a burst of economic activity and increased productivity comparable to that of the Internet and PC revolutions of the mid to late 1990s. Social movements will arise to free up access to large data repositories, to restrict the development and use of AIs, and to ‘liberate' AIs."[4]
These are very interesting arguments and they do begin to get to the heart of the matter - which is that our data sets have grown beyond our ability to analyze and process them without sophisticated automation. We simply have to rely on technology to analyze and cope with this enormous wave of content and metadata.
Analyzing human-generated Big Data has enormous potential. More than potential, harnessing the power of metadata has become essential to manage and protect human-generated content. File shares, emails, and intranets have made it so easy for end users to save and share files that organizations now have more human-generated content than they can sustainably manage and protect using small data thinking. Many organizations face real problems because questions that could be answered 15 years ago on smaller, more static data sets can no longer be answered. These questions include: Where does critical data reside, who accesses it, and who should have access to it? As a consequence, IDC estimates that only half the data that should be protected is protected.
The problem is compounded with cloud-based file sharing, as these services create yet another growing store of human-generated content requiring management and protection - one that lies outside corporate infrastructure with different controls and management processes.
David Weinberger of Harvard University's Berkman Center said: "We are just beginning to understand the range of problems Big Data can solve, even though it means acknowledging that we're less unpredictable, free, madcap creatures than we'd like to think. If harnessing the power of human generated big data can make data protection and management less unpredictable, free, and madcap, organizations will be grateful.
References
- http://www.elon.edu/e-web/predictions/expertsurveys/2012survey/future_Big_Data_2020.xhtml
- http://pewinternet.org/Reports/2012/Future-of-Big-Data/Overview.aspx
- http://pewinternet.org/Reports/2012/Future-of-Big-Data/Overview.aspx
- http://pewinternet.org/Reports/2012/Future-of-Big-Data/Overview.aspx
Published October 29, 2012 Reads 3,239
Copyright © 2012 SYS-CON Media, Inc. — All Rights Reserved.
Syndicated stories and blog feeds, all rights reserved by the author.
More Stories By Rob Sobers
Rob Sobers is a designer, web developer, and technical marketing manager for Varonis, where he oversees the company's online marketing strategy. He writes a popular blog on software and security at accidentalhacker.com and is co-author of the book "Learn Ruby the Hard Way", which has been used by thousands of students to learn the Ruby programming language. Rob is a 12-year technology industry veteran and, prior to joining Varonis, held positions in software engineering, design, and professional services.
- Cloud People: A Who's Who of Cloud Computing
- Cloud Expo New York Speaker Profile: Dave Linthicum – Cloud Technology Partners
- Cloud Expo New York: Cloud Is Changing the Economics of Business
- Cloud Expo New York: Delivering Digital Marketing on the Cloud
- Cloud Expo New York: Deploying Hybrid Cloud for Performance and Uptime
- Big Data Isn’t About the Database, It’s About the Application
- BEA Updates WebLogic SOA Portal for Web 2.0 Era
- Cloudant to Exhibit at Cloud Expo & Big Data Expo New York
- Cloud Expo New York: Rethink IT and Reinvent Business with IBM SmartCloud
- How to Move Your Oracle Databases to Amazon EC2 Cloud
- The Accessibility of the Cloud
- Cloud Expo New York | Danger Ahead: Why File Sync Is NOT Endpoint Backup
- Cloud People: A Who's Who of Cloud Computing
- Cloud Expo New York: Best CIO Practices Shared from SHI’s Customers
- Cloud Expo New York Speaker Profile: Dave Linthicum – Cloud Technology Partners
- Examining the True Cost of Big Data
- Cloud Expo New York: Cloud Is Changing the Economics of Business
- Cloud Expo New York: How to Use Google Apps Script
- Cloud Computing Bootcamp at Cloud Expo New York
- Software Defined Networking – A Paradigm Shift
- Rackspace Hosting Named “Platinum Plus Sponsor” of Cloud Expo New York
- Cloud Expo New York: Why Big Data Is Really About Small Data
- Cloud Expo New York: Delivering Digital Marketing on the Cloud
- Cloud Expo New York: Deploying Hybrid Cloud for Performance and Uptime
- The i-Technology Right Stuff
- The Top 150 Players in Cloud Computing
- Who Are The All-Time Heroes of i-Technology?
- Where Are RIA Technologies Headed in 2008?
- Get the Message
- i-Technology Viewpoint: Is Web 2.0 the Global SOA?
- ESB Myth Busters: 10 Enterprise Service Bus Myths Debunked
- i-Technology Viewpoint: Thinking Outside the VC Box
- i-Technology Viewpoint: When to Leave Your First IT Job
- SOA Web Services Edge Conference Coverage on SYS-CON.TV
- SYS-CON.TV's "SOA Web Services" and "Enterprise Open Source" Programs To Air in December
- Five Reasons Why Web 2.0 Matters






















