Social Network Analysis at New Frontiers in Computing 2013

by Joseph Rickert

This past Saturday, the New Frontiers in Computing Conference (NFIC 2013), held at Stanford University, explored the theme: Social Network Analysis: It’s Who You Know. The speakers were a well-chosen, eclectic lot who covered a remarkable array of issues in less than a full day.

Ian Hersey, former CTO of Attensity, spoke on Lessons from Large-Scale Social Analytics. Michael Wu, Chief Scientist of Lithium Technologies, provided an introduction to social network analysis and very gamely conducted a live experiment, building a social network of attendee tweets during the conference. Rong Yan, the Engineering Manager for Ads Relevance and Quality at Facebook, spoke about machine learning insights. Zahan Malkani, an engineer at Facebook, presented “Dog”, the yet-to-be-released social media programming language. Shivakumar Vaithyanathan, Chief Scientist for Text Analytics at the IBM Almaden Research Center, described text analytics work that is built around IBM’s Annotation Query Language (AQL). Laura Jacob, a FactSet engineer and president of the IEEE’s Society on the Social Implications of Technology, spoke about “Context Collapse”, a fundamental cause of the damaging “oversharing” trap that so many Facebook and Twitter users fall into. Finally, John Rehling, Senior Research Scientist at Reputation.com, “cleaned up” with an alarming discussion of the mind-boggling hazards we all face in just using the Internet.

Although most of the talks were obviously enhanced versions of corporate presentations, there was nothing superficial about the day. Collectively, the presentations and panel discussions provided a comprehensive, multidimensional look at the technologies, issues and challenges associated with social networks. Most refreshingly, the day was largely hype-free: no beating the drum for big data or promoting unreasonable expectations for Hadoop. The presenters all seemed to be largely in agreement about current best practices in technology.
Hadoop, for example, was characterized as the place for massive amounts of persistent data, but not a suitable platform for ingesting social media data, where low latency is of paramount importance. Rong Yan also pointed out that although Facebook is a big Hadoop shop, they do not use MapReduce for analyses that require status sharing among processors distributed across the cluster.

R came up at various times during the discussions in a matter-of-fact way. Rong pointed out, for example, that for data stored in Hadoop clusters, Pig or Hive will typically be used to aggregate the data, at which point it is no longer big data. After that, R, Matlab or SQL might be used for analysis. He indicated that most business questions can be answered with relatively small data sets. When it really is necessary to work with a large data set, the analysis is likely to be done in C++. At one point Shivakumar casually remarked that AQL syntax looks a lot like R.

A technical highlight of the day was Michael Wu’s introduction to social network analysis (SNA). With the help of an open source plug-in to Excel, he was able to start from first principles and work up to explaining some fairly sophisticated performance metrics for social network graphs, such as eigenvector centrality. Basically, this is the notion of giving high scores to nodes that are connected to nodes that are themselves central within the network. (For a very nice explanation of this idea and pointers to the source papers, have a look at the PLoS paper by Gabrielle Lohmann et al.) Michael gave a remarkably clear presentation, and although he did not use R, he could have. For anyone with an interest in getting started with SNA, I recommend the 2010 Social Network Analysis Labs in R written by McFarland, Messing and Nowak. The labs use functions from the igraph package and data from the NetData package to provide a challenging introductory SNA course.
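
As a rough illustration of the eigenvector centrality idea described above, here is a minimal sketch in plain Python rather than R or the Excel plug-in used in the talk. The five-person network and its names are made up for this example, and the power-iteration approach with a self-weight on each node is one common way to compute the scores, not necessarily what any of the tools mentioned here do internally.

```python
import math


def eigenvector_centrality(adjacency, max_iter=500, tol=1e-12):
    """Approximate eigenvector centrality by power iteration.

    adjacency maps each node to the set of its neighbours (undirected graph).
    Each step adds a node's own score to the sum of its neighbours' scores;
    this is iteration with (A + I), which has the same leading eigenvector
    as A but does not oscillate on bipartite graphs such as trees.
    """
    nodes = sorted(adjacency)
    scores = {node: 1.0 for node in nodes}
    for _ in range(max_iter):
        updated = {
            node: scores[node] + sum(scores[nb] for nb in adjacency[node])
            for node in nodes
        }
        # Rescale to unit Euclidean length so the scores stay bounded.
        norm = math.sqrt(sum(v * v for v in updated.values()))
        updated = {node: v / norm for node, v in updated.items()}
        change = max(abs(updated[node] - scores[node]) for node in nodes)
        scores = updated
        if change < tol:
            break
    return scores


# Hypothetical toy network: ann is a hub, eve hangs off dan.
network = {
    "ann": {"bob", "cat", "dan"},
    "bob": {"ann"},
    "cat": {"ann"},
    "dan": {"ann", "eve"},
    "eve": {"dan"},
}

scores = eigenvector_centrality(network)
for name in sorted(scores, key=scores.get, reverse=True):
    print(f"{name}: {scores[name]:.3f}")
```

In this toy network bob and eve each have exactly one connection, yet bob ends up with the higher score because his single neighbour (ann) is more central than eve's (dan). That is the sense in which the metric rewards being connected to well-connected nodes rather than simply counting connections.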
The first plot (from the 4th lab in the McFarland, Messing and Nowak series) shows a network graph of student interactions using the studentnets.S641 data set. The next plot shows the eigenvector centrality score for each student.

The most fascinating and distressing presentations and discussions happened in the section on Privacy Implications for SNA. Laura Jacob started things off by providing some social theory background for the problem of inadvertently oversharing on social media sites. Frequently this sort of thing happens when the imagined audience for a tweet, message or photo turns out not to be the actual audience. This “context collapse” results from the tension between the individual’s attempt to establish some level of privacy and the social media site’s desire to obtain information. Laura explained that social media sites know that if they put you in a certain context, you are more likely to share information that is appropriate for that context. However, unless you are really careful about the privacy settings, the actual context might include a wider audience than intended. At some level, participating in social media is like continually reliving that part of your wedding day where you worked very hard to limit the conversation between your new in-laws at Table 1 and your Vegas party friends at Table 12. For more on the theory, take a look at Laura’s suggested reading list of (Goffman 1959) and (Marwick 2010).

In the final presentation of the day, John Rehling took the attendees through the “Spectrum of Social Distance”: self < younger self < family < friend < acquaintance < enemy. He recounted a number of cases where reputations were tarnished and irrevocable damage was done by people closer than family, and then pointed out that in the future we can expect to live in a world where individually innocuous bits of information will be assembled to form damaging information.
This very brief summary of the conference does not do justice to any of the presenters, but I will end here with Ian Hersey’s list of ongoing challenges for SNA:

The growth in the volume of data (10% increase per month)
Data quality assurance
Rich natural language processing in many languages across many domains
The sparseness of geocoded data
Veracity (there is lots of gaming going on in social media)
Irony / sarcasm detection

Finally, I'm betting that not long after Dog we will have “RDog”.


More Stories By David Smith

David Smith is Vice President of Marketing and Community at Revolution Analytics. He has a long history with the R and statistics communities. After graduating with a degree in Statistics from the University of Adelaide, South Australia, he spent four years researching statistical methodology at Lancaster University in the United Kingdom, where he also developed a number of packages for the S-PLUS statistical modeling environment. He continued his association with S-PLUS at Insightful (now TIBCO Spotfire), overseeing the product management of S-PLUS and other statistical and data mining products.

David Smith is the co-author (with Bill Venables) of the popular tutorial manual, An Introduction to R, and one of the originating developers of the ESS: Emacs Speaks Statistics project. Today, he leads marketing for REvolution R, supports R communities worldwide, and is responsible for the Revolutions blog. Prior to joining Revolution Analytics, he served as vice president of product management at Zynchros, Inc. Follow him on Twitter at @RevoDavid.
