Social Network Analysis at New Frontiers in Computing 2013

by Joseph Rickert

This past Saturday, the New Frontiers in Computing Conference (NFIC 2013), held at Stanford University, explored the theme “Social Network Analysis: It’s Who You Know”. The speakers were a well-chosen, eclectic lot who covered a remarkable array of issues in less than a full day. Ian Hersey, former CTO of Attensity, spoke on Lessons from Large-Scale Social Analytics. Michael Wu, Chief Scientist of Lithium Technologies, provided an introduction to social network analysis and very gamely conducted a live experiment, building a social network of attendee tweets during the conference. Rong Yan, the Engineering Manager for Ads Relevance and Quality at Facebook, spoke about machine learning insights. Zahan Malkani, an engineer at Facebook, presented “Dog”, the yet-to-be-released social media programming language. Shivakumar Vaithyanathan, Chief Scientist for Text Analytics at the IBM Almaden Research Center, presented text analytics work that is built around IBM’s Annotation Query Language (AQL). Laura Jacob, a FactSet engineer and president of the IEEE’s Society on the Social Implications of Technology, spoke about “Context Collapse”, a fundamental cause of the damaging “oversharing” trap that so many Facebook and Twitter users fall into. Finally, John Rehling, Senior Research Scientist at Reputation.com, “cleaned up” with an alarming discussion of the mind-boggling hazards we all face in just using the Internet.

Although most of the talks were obviously enhanced versions of corporate presentations, there was nothing superficial about the day. Collectively, the presentations and panel discussions provided a comprehensive, multidimensional look at the technologies, issues and challenges associated with social networks. Most refreshingly, the day was largely hype free: no beating the drum for big data or promoting unreasonable expectations for Hadoop. The presenters all seemed to agree, for the most part, about the current best practices in technology.
Hadoop, for example, was characterized as the place for massive amounts of persistent data, but not as a suitable platform for ingesting social media data, where low latency is of paramount importance. And Rong Yan pointed out that although Facebook is a big Hadoop shop, they do not use MapReduce for analyses that require status sharing among processors distributed across the cluster.

R came up at various times during the discussions in a matter-of-fact way. Rong pointed out, for example, that for data stored in Hadoop clusters, Pig or Hive will typically be used to aggregate the data, at which point it is no longer big data. After that, R, Matlab or SQL might be used for analysis. He indicated that most business questions can be answered with relatively small data sets. When it really is necessary to work with a large data set, the analysis is likely to be done in C++. At one point Shivakumar casually remarked that AQL syntax looks a lot like R.

A technical highlight of the day was Michael Wu’s introduction to social network analysis (SNA). With the help of an open source plug-in to Excel, he was able to start from first principles and work up to explaining some fairly sophisticated performance metrics for social network graphs, such as eigenvector centrality. Basically, this is the notion of giving high scores to nodes that are connected to nodes that are themselves central within the network. (For a very nice explanation of this idea and pointers to the source papers, have a look at the PLoS paper by Gabrielle Lohmann et al.) Michael gave a remarkably clear presentation and, although he did not use R, he could have. For anyone with an interest in getting started with SNA, I recommend the 2010 Social Network Analysis Labs in R written by McFarland, Messing and Nowak. The labs use functions from the igraph package and data from the NetData package to provide a challenging introductory SNA course.
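
To make the eigenvector centrality idea concrete, here is a minimal sketch in plain Python. It is not taken from Michael’s talk or from the labs, and it is not how igraph computes the score; it simply applies power iteration, a standard way to approximate the leading eigenvector, to a made-up five-person network. The function name, its parameters and the toy data are all assumptions chosen for illustration.

```python
import math


def eigenvector_centrality(adjacency, max_iterations=200, tolerance=1e-9):
    """Approximate eigenvector centrality with power iteration.

    `adjacency` maps each node to the set of its neighbours. The graph is
    assumed to be undirected and connected.
    """
    nodes = list(adjacency)
    if not nodes:
        return {}

    # Start by treating every node as equally central.
    scores = {node: 1.0 for node in nodes}

    for _ in range(max_iterations):
        # A node's new score is its old score plus its neighbours' scores.
        # Keeping the old score (iterating with A + I rather than A) does not
        # change the scores the iteration converges to, but it stops the
        # values from oscillating on tree-like graphs such as the toy
        # example below.
        updated = {
            node: scores[node] + sum(scores[nb] for nb in adjacency[node])
            for node in nodes
        }

        # Rescale to unit length so the numbers stay bounded.
        norm = math.sqrt(sum(value * value for value in updated.values()))
        updated = {node: value / norm for node, value in updated.items()}

        change = max(abs(updated[node] - scores[node]) for node in nodes)
        scores = updated
        if change < tolerance:
            break

    return scores


# Hypothetical five-person network: "a" knows "b", "c" and "d"; "d" knows "e".
toy_network = {
    "a": {"b", "c", "d"},
    "b": {"a"},
    "c": {"a"},
    "d": {"a", "e"},
    "e": {"d"},
}

scores = eigenvector_centrality(toy_network)
ranking = sorted(scores, key=scores.get, reverse=True)
for person in ranking:
    print(person, round(scores[person], 3))
```

In this toy network “a” comes out on top and “d” second. More tellingly, “b” and “c” outrank “e” even though all three have exactly one connection, because their single connection is to the most central node. That is the “connected to nodes that are themselves central” effect in miniature.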
The first plot (from the 4th lab) shows a network graph of student interactions using the studentnets.S641 data set. This next plot shows the eigenvector centrality score for each student.

The most fascinating and distressing presentations and discussions happened in the section on Privacy Implications for SNA. Laura Jacob started things off here by providing some social theory background for the problem of inadvertently oversharing on social media sites. Frequently this sort of thing happens when the imagined audience for a tweet, message or photo turns out not to be the actual audience. This “context collapse” results from the tension between the individual’s attempt to establish some level of privacy and the social media site’s desire to obtain information. Laura explained that social media sites know that if they put you in a certain context, you are more likely to share information that is appropriate for that context. However, unless you are really careful about the privacy settings, the actual context might include a wider audience than intended. At some level, participating in social media is like continually reliving that part of your wedding day where you worked very hard to limit the conversation between your new in-laws at Table 1 and your Vegas party friends seated at Table 12. For more on the theory, take a look at Laura’s suggested reading list of (Goffman 1959) and (Marwick 2010).

In the final presentation of the day, John Rehling took the attendees through the “Spectrum of Social Distance” (self < younger self < family < friend < acquaintance < enemy), recounted a number of cases where reputations were tarnished and irrevocable damage was done by people closer than family, and then pointed out that in the future we can expect to live in a world where individually innocuous bits of information will be assembled to form damaging information.
This very brief summary of the conference does not do justice to any of the presenters, but I will end here with Ian Hersey’s list of ongoing challenges for SNA:

- The growth in the volume of data (10% increase per month)
- Data quality assurance
- Rich natural language processing in many languages across many domains
- The sparseness of geocoded data
- Veracity (there is lots of gaming going on in social media)
- Irony / sarcasm detection

Finally, I'm betting that not long after Dog we will have “RDog”.


