In at the Deep End: Training for High Performance Distributed Systems

Six best practices for training new engineers

Here at Logentries we have a simple philosophy when it comes to hiring: hire the best people we can find and let them jump in at the deep end. That is how we like to learn. Smart people like to go deep and discover what they don’t know as they work through real-world problems. Our job is to give them the mentoring and support they need to overcome blockers quickly and keep learning. We make sure they come to us with a strong computer science or mathematical background, and we take it from there.

When we designed our training program, we built it around the following set of problems:

1. CAP Theorem

Learn to accept that consistency, availability and partition tolerance cannot all be guaranteed at the same time: when the network partitions, you have to choose between staying consistent and staying available. Decide what is important for your customers and make the best trade-offs you can. Whatever you build also needs to be understandable. Raft is a consensus algorithm designed to be understandable, or at least easier to comprehend than Paxos and Paxos-influenced algorithms, which makes it a good place to start on the distributed systems challenge. So we start with the theory, and in the next step we get practical.

2. Embrace Hardware Failures

Logentries is an Amazon EC2-based service, an environment where failures in parts of the system are routine. Netflix has taken this approach to its natural conclusion: if you expect failure, and you know that when it happens it will happen at the worst possible time, why not plan for it to happen all the time? So they built Chaos Monkey, a tool that randomly generates various kinds of failures in your environment. You can run it on a schedule so that failures happen while you are closely monitoring things. By combining Raft and Chaos Monkey, you get to see what happens when things go wrong and learn how to build fault-tolerant systems that keep working through those failures.
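The idea of injecting failures into a quorum-based system can be sketched in a few lines. This is a toy model only, not Chaos Monkey's or Raft's actual API: the `Cluster` class, its method names and the node names are all hypothetical. It shows the one property a majority-based consensus algorithm such as Raft relies on, namely that the cluster can make progress only while more than half of its nodes are alive.

```python
import random


class Cluster:
    """Toy cluster model (hypothetical, for illustration only)."""

    def __init__(self, node_names):
        # Every node starts out alive.
        self.alive = {name: True for name in node_names}

    def kill_random_node(self, rng):
        """Mark one randomly chosen live node as failed and return its name."""
        candidates = sorted(name for name, up in self.alive.items() if up)
        if not candidates:
            return None
        victim = rng.choice(candidates)
        self.alive[victim] = False
        return victim

    def has_quorum(self):
        """True while a strict majority of all nodes is still alive."""
        return sum(self.alive.values()) > len(self.alive) // 2


rng = random.Random(42)  # seeded so that a run can be repeated
cluster = Cluster(["n1", "n2", "n3", "n4", "n5"])

for _ in range(3):
    victim = cluster.kill_random_node(rng)
    print(f"killed {victim}, quorum: {cluster.has_quorum()}")
```

A five-node cluster keeps its quorum after two failures and loses it on the third, whichever nodes the random generator picks.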

3. Embrace the Asynchronous

Blocking calls will kill performance and may impact availability. The slowest system in a chain causes a backlog on the other nodes. Asynchronous design lets you avoid the cascade effects that occur when one node in the system brings down others, and it is the best way to build fault-tolerant systems. So we start by building a system that relies on synchronous calls, show what happens under extreme load or when failures occur, and then evolve it to work asynchronously.
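A minimal sketch of that progression, assuming a single slow downstream node simulated with `asyncio.sleep`. The function names and the 50 ms latency are made up for illustration. The first handler waits for each call to finish before issuing the next, so the latency adds up; the second issues all calls at once and waits for them together.

```python
import asyncio
import time

SLOW_CALL_SECONDS = 0.05  # hypothetical latency of the slowest node


async def slow_node(request_id):
    """Stand-in for a call to a slow downstream service."""
    await asyncio.sleep(SLOW_CALL_SECONDS)
    return f"done-{request_id}"


async def handle_sequentially(n):
    """Each request waits for the previous one, so a backlog builds up."""
    return [await slow_node(i) for i in range(n)]


async def handle_concurrently(n):
    """All requests are in flight at once; total time is about one call."""
    return list(await asyncio.gather(*(slow_node(i) for i in range(n))))


def timed(handler, n):
    """Run a handler to completion and return (results, elapsed seconds)."""
    start = time.perf_counter()
    results = asyncio.run(handler(n))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    _, sequential_time = timed(handle_sequentially, 5)
    _, concurrent_time = timed(handle_concurrently, 5)
    print(f"sequential: {sequential_time:.3f}s, concurrent: {concurrent_time:.3f}s")
```

Both handlers return the same results in the same order; only the time spent waiting differs.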

4. Mechanical Sympathy

The LMAX Disruptor team understands that to get every last drop of performance out of your system you sometimes need to know every last detail of the hardware environment you are running on. They call it Mechanical Sympathy. They took what they knew about modern-day processor architectures (caches, memory barriers, compare-and-swap etc.) and turned it into a software solution for high-performance inter-thread messaging, which leads to massive performance improvements when used correctly. So next we show our graduates what happens when you move from queue-based interactions to using the Disruptor.
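The data structure at the heart of this style of messaging is a preallocated ring buffer addressed by ever-increasing sequence numbers. The sketch below is not the LMAX Disruptor API (which is a Java library); the `RingBuffer` class and its methods are hypothetical, and plain Python cannot demonstrate the cache-level effects. It only illustrates the layout: a fixed array whose size is a power of two, so a sequence number maps to a slot with a bit mask, and one counter each for the producer and the consumer.

```python
class RingBuffer:
    """Single-producer, single-consumer ring buffer (illustrative sketch)."""

    def __init__(self, size):
        if size <= 0 or size & (size - 1):
            raise ValueError("size must be a power of two")
        self._slots = [None] * size  # allocated once, reused forever
        self._mask = size - 1        # sequence & mask == sequence % size
        self._published = 0          # next sequence the producer writes
        self._consumed = 0           # next sequence the consumer reads

    def try_publish(self, item):
        """Store item in the next slot; return False if the buffer is full."""
        if self._published - self._consumed == len(self._slots):
            return False
        self._slots[self._published & self._mask] = item
        self._published += 1
        return True

    def try_consume(self):
        """Return (True, item) for the oldest entry, or (False, None) if empty."""
        if self._consumed == self._published:
            return False, None
        item = self._slots[self._consumed & self._mask]
        self._consumed += 1
        return True, item


ring = RingBuffer(4)
for event in ("a", "b", "c"):
    ring.try_publish(event)
print(ring.try_consume())
```

In a real implementation the two counters are updated by different threads, which is where memory barriers and cache-line padding come in; this sketch leaves all of that out.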

5. Horizontal over Vertical Scaling

Vertical scaling can give you quick wins in performance, but when it comes to building a business like Logentries, where subscriber growth outpaces technology, the only way to design systems that scale with your success is to scale them horizontally. Don’t wait for technology to set limits on your growth. It is also better to scale on commodity hardware or, in our case, on the EC2 servers that give you the best price-performance, rather than rely on the latest premium-priced bells and whistles. When it comes to the practical side of scaling, we focus on splitting out the load, but we plan for multiple axes of scale. Scaling out horizontally gives us the first split, where transactions are spread over nodes. We also make sure we have the option of partitioning by type of transaction (log queries, API calls, live tail requests etc.) and by customer.
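Partitioning along two of those axes can be sketched as a small routing function. This is an illustration under assumptions, not a description of how Logentries routes traffic: the pool names, node names and the `route` function are hypothetical. The transaction type selects a pool of nodes, and a stable hash of the customer identifier selects a node within that pool, so one customer's requests of a given type always land on the same node.

```python
import hashlib

# Hypothetical node pools, one per transaction type.
POOLS = {
    "log_query": ["query-0", "query-1", "query-2"],
    "api_call": ["api-0", "api-1"],
    "live_tail": ["tail-0", "tail-1", "tail-2", "tail-3"],
}


def route(transaction_type, customer_id):
    """Pick a node: first by transaction type, then by customer."""
    nodes = POOLS[transaction_type]  # raises KeyError for unknown types
    digest = hashlib.sha256(customer_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]


print(route("log_query", "customer-42"))
print(route("live_tail", "customer-42"))
```

A cryptographic hash is used here only because it is stable across processes, unlike Python's built-in `hash` for strings. Simple modulo placement also reshuffles most customers when a pool changes size, which is why schemes such as consistent hashing are often used for this step instead.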

6. Talent Borrows, Genius Steals

We work only ten minutes away from where Oscar Wilde was born (Merrion Square, Dublin), so we do like to keep his words in mind. There are so many great distributed technologies around today, and they are getting better all the time. We have been inspired by Kafka, Hadoop, etcd, BDAS etc. But we like to control our own destiny, and there are some very good reasons why we have built our platform from the ground up to do what it does in the best way it can. Turning logs into business insights isn’t easy, so we do like to stand on the shoulders of some distributed systems giants. The final step in our training program is therefore to evaluate other distributed systems in the context of what we have learnt in the previous steps: learn how Kafka applies mechanical sympathy to maximize disk I/O, understand Raft in the context of etcd, and understand the challenges of scaling analytics in the context of Hadoop.

Over the next few months, as we put our training program into practice, we hope to have some of our graduates blog about their experiences. Stay tuned!

By Trevor Parsons

Trevor Parsons is Chief Scientist and Co-founder of Logentries. Trevor has over 10 years’ experience in enterprise software and has specialized in developing enterprise monitoring and performance tools for distributed systems. He is also a research fellow at the Performance Engineering Lab Research Group and was formerly a Scientist at the IBM Center for Advanced Studies. Trevor holds a PhD from University College Dublin, Ireland.
