Social Network Analysis at New Frontiers in Computing 2013

by Joseph Rickert

This past Saturday, the New Frontiers in Computing Conference (NFIC 2013), held at Stanford University, explored the theme “Social Network Analysis: It’s Who You Know”. The speakers were a well-chosen, eclectic lot who covered a remarkable array of issues in less than a full day. Ian Hersey, former CTO of Attensity, spoke on “Lessons from Large-Scale Social Analytics”. Michael Wu, Chief Scientist of Lithium Technologies, provided an introduction to social network analysis and very gamely conducted a live experiment, building a social network from attendee tweets during the conference. Rong Yan, the Engineering Manager for Ads Relevance and Quality at Facebook, spoke about machine learning insights. Zahan Malkani, an engineer at Facebook, presented “Dog”, the yet-to-be-released social media programming language. Shivakumar Vaithyanathan, Chief Scientist for Text Analytics at the IBM Almaden Research Center, presented text analytics work that is built around IBM’s Annotation Query Language (AQL). Laura Jacob, a Factset engineer and president of the IEEE’s Society on the Social Implications of Technology, spoke about “Context Collapse”, a fundamental cause of the damaging “oversharing” trap that so many Facebook and Twitter users fall into. Finally, John Rehling, Senior Research Scientist at Reputation.com, “cleaned up” with an alarming discussion of the mind-boggling hazards we all face in just using the Internet.

Although most of the talks were obviously enhanced versions of corporate presentations, there was nothing superficial about the day. Collectively, the presentations and panel discussions provided a comprehensive, multidimensional look at the technologies, issues and challenges associated with social networks. Most refreshingly, the day was largely hype free: no beating the drum for big data and no promoting of unreasonable expectations for Hadoop.

The presenters all seemed to be pretty much in agreement about the current best practices in technology.
Hadoop, for example, was characterized as the place for massive amounts of persistent data, but not as a suitable platform for ingesting social media data, where low latency is of paramount importance. Rong Yan also pointed out that although Facebook is a big Hadoop shop, it does not use MapReduce for analyses that require status sharing among processors distributed across the cluster.

R came up at various times during the discussions in a matter-of-fact way. Rong pointed out, for example, that for data stored in Hadoop clusters, Pig or Hive will typically be used to aggregate the data, at which point it is no longer big data. After that, R, Matlab or SQL might be used for analysis. He indicated that most business questions can be answered with relatively small data sets. When it really is necessary to work with a large data set, the analysis is likely to be done in C++. At one point Shivakumar casually remarked that AQL syntax looks a lot like R.

A technical highlight of the day was Michael Wu’s introduction to social network analysis (SNA). With the help of an open source plug-in to Excel, he was able to start from first principles and work up to explaining some fairly sophisticated metrics for social network graphs, such as eigenvector centrality. Basically, this is the notion of giving high scores to nodes that are connected to nodes that are themselves central within the network. (For a very nice explanation of this idea and pointers to the source papers, have a look at the PLOS paper by Gabrielle Lohmann et al.) Michael gave a remarkably clear presentation, and although he did not use R, he could have.

For anyone with an interest in getting started with SNA, I recommend the 2010 Social Network Analysis Labs in R written by McFarland, Messing and Nowak. The labs use functions from the igraph package and data from the NetData package to provide a challenging introductory SNA course.
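
The eigenvector centrality idea described above can be illustrated with a small sketch. This is not the Excel plug-in Michael used, nor the igraph code from the labs; it is a plain Python version (chosen only so it runs without any packages) of power iteration, one common way to compute the scores. The five-person friendship network and the names `EDGES` and `eigenvector_centrality` are hypothetical, made up for illustration.

```python
from math import sqrt

# Hypothetical toy network: each pair is an undirected "knows" link.
EDGES = [
    ("ann", "bob"),
    ("ann", "cat"),
    ("ann", "dan"),
    ("bob", "cat"),
    ("dan", "eve"),
]


def eigenvector_centrality(edges, max_iter=200, tol=1e-10):
    """Approximate eigenvector centrality by power iteration.

    Each round, a node's new score is its own score plus the sum of its
    neighbors' scores. Adding the node's own score (iterating with A + I
    instead of A) leaves the leading eigenvector unchanged but avoids
    oscillation on bipartite graphs. Assumes a connected, undirected graph.
    """
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    # Start with every node equally important.
    scores = {node: 1.0 for node in neighbors}

    for _ in range(max_iter):
        updated = {
            node: scores[node] + sum(scores[n] for n in neighbors[node])
            for node in neighbors
        }
        # Rescale to unit length so the scores do not grow without bound.
        norm = sqrt(sum(v * v for v in updated.values()))
        updated = {node: v / norm for node, v in updated.items()}

        change = max(abs(updated[node] - scores[node]) for node in updated)
        scores = updated
        if change < tol:
            break
    return scores


if __name__ == "__main__":
    result = eigenvector_centrality(EDGES)
    for name, score in sorted(result.items(), key=lambda kv: -kv[1]):
        print(f"{name}: {score:.3f}")
```

In this toy network bob and dan each have exactly two connections, yet bob ranks above dan because bob's neighbors (ann and cat) are themselves better connected than dan's (ann and eve). That is the difference between eigenvector centrality and a simple count of links.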

The first plot from the McFarland, Messing and Nowak labs (from the 4th lab) shows a network graph of student interactions using the studentnets.S641 data set. The next plot shows the eigenvector centrality score for each student.

The most fascinating and distressing presentations and discussions happened in the section on Privacy Implications for SNA. Laura Jacob started things off here by providing some social theory background for the problem of inadvertently oversharing on social media sites. Frequently this sort of thing happens when the imagined audience for a tweet, message or photo turns out not to be the actual audience. This “context collapse” results from the tension between the individual’s attempt to establish some level of privacy and the social media site’s desire to obtain information. Laura explained that social media sites know that if they put you in a certain context, you are more likely to share information that is appropriate for that context. However, unless you are really careful about the privacy settings, the actual context might include a wider audience than intended. At some level, participating in social media is like continually reliving that part of your wedding day where you worked very hard to limit the conversation between your new in-laws at Table 1 and your Vegas party friends seated at Table 12. For more on the theory, take a look at Laura’s suggested reading list of (Goffman 1959) and (Marwick 2010).

In the final presentation of the day, John Rehling took the attendees through the “Spectrum of Social Distance”: self < younger self < family < friend < acquaintance < enemy. He recounted a number of cases where reputations were tarnished and irrevocable damage was done by people closer than family, and then pointed out that in the future we can expect to live in a world where individually innocuous bits of information will be assembled into damaging information.

This very brief summary of the conference does not do justice to any of the presenters, but I will end here with Ian Hersey’s list of ongoing challenges for SNA: the growth in the volume of data (a 10% increase per month); data quality assurance; rich natural language processing in many languages across many domains; the sparseness of geocoded data; veracity (there is lots of gaming going on in social media); and irony and sarcasm detection.

Finally, I'm betting that not long after Dog we will have “RDog”.

David Smith is Vice President of Marketing and Community at Revolution Analytics. He has a long history with the R and statistics communities. After graduating with a degree in Statistics from the University of Adelaide, South Australia, he spent four years researching statistical methodology at Lancaster University in the United Kingdom, where he also developed a number of packages for the S-PLUS statistical modeling environment. He continued his association with S-PLUS at Insightful (now TIBCO Spotfire), overseeing the product management of S-PLUS and other statistical and data mining products.

David Smith is the co-author (with Bill Venables) of the popular tutorial manual, An Introduction to R, and one of the originating developers of the ESS: Emacs Speaks Statistics project. Today, he leads marketing for REvolution R, supports R communities worldwide, and is responsible for the Revolutions blog. Prior to joining Revolution Analytics, he served as vice president of product management at Zynchros, Inc. Follow him on Twitter at @RevoDavid.