Microservices Expo: Article

In at the Deep End: Training for High Performance Distributed Systems

Six best practices for training new engineers

Here at Logentries we have a simple philosophy when it comes to hiring: hire the best people we can find and let them jump in at the deep end. That is how we like to learn. Smart people like to go deep and then find out what they don’t know as they work through some real-world problems. Our job is to give them the mentoring and support they need to overcome blockers quickly and continue the learning process. We ensure they come to us with a great computer science or mathematical background and we take it from there.

When we thought about designing a training program, we built it around the following set of problems:

1. CAP Theorem

Learn to accept that consistency, availability and partition tolerance cannot all be guaranteed at the same time. Decide what is important for your customers and make the best trade-offs you can. But what you build needs to be understandable. Raft provides a consensus algorithm whose aim is to be understandable, or at least to be easier to comprehend than Paxos or Paxos-influenced algorithms. It’s a good place to start to understand the distributed systems challenge. So we start with the theory, and in the next step we get practical…
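To make the trade-off concrete, here is a minimal sketch of the majority rule that Raft-style consensus relies on: an entry counts as committed only once a majority of the cluster holds it. The function names `majority` and `can_commit` are illustrative assumptions, not part of Raft or of any library, and everything else in the algorithm (leader election, log matching, terms) is deliberately left out.

```python
def majority(cluster_size: int) -> int:
    """Smallest number of nodes that forms a majority of the cluster."""
    return cluster_size // 2 + 1


def can_commit(cluster_size: int, reachable_nodes: int) -> bool:
    """True if the leader plus the followers it can reach form a majority.

    `reachable_nodes` counts the leader itself. Both names are hypothetical
    helpers for this illustration.
    """
    return reachable_nodes >= majority(cluster_size)
```

With five nodes split three and two by a network partition, `can_commit(5, 3)` is true and `can_commit(5, 2)` is false: the minority side stops accepting writes, which is the price paid in availability for keeping the log consistent.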

2. Embrace Hardware Failures

Logentries is an Amazon EC2-based service, an environment where failures in parts of the system are routine. Netflix has taken this approach to its natural conclusion: if you expect failure, and know that when it happens it is going to happen at the worst possible time, why not plan for it to happen all the time? So they built Chaos Monkey, a tool that randomly terminates running instances in your environment. You can run it on a schedule so that failures happen when you are closely monitoring things. By combining Raft and Chaos Monkey, you get to understand what happens when things go wrong and how to build fault-tolerant systems that can deal with failure.
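The idea of rehearsing failure can be tried on a small scale before touching real infrastructure. The sketch below is not Chaos Monkey and does not use its API; `FlakyReplica`, `InjectedFailure` and `call_with_failover` are hypothetical names invented for this illustration. Each replica fails with a configurable probability, and the caller is expected to survive by moving on to the next one.

```python
import random


class InjectedFailure(Exception):
    """Raised when the harness decides a replica should fail (hypothetical)."""


class FlakyReplica:
    """A stand-in for a service instance that fails at a chosen rate."""

    def __init__(self, name, failure_rate, rng=None):
        self.name = name
        self.failure_rate = failure_rate  # 0.0 never fails, 1.0 always fails
        self.rng = rng or random.Random()

    def handle(self, request):
        # random() returns a float in [0.0, 1.0), so the comparison below
        # is always false for a rate of 0.0 and always true for 1.0.
        if self.rng.random() < self.failure_rate:
            raise InjectedFailure(f"{self.name} was killed")
        return f"{self.name} handled {request}"


def call_with_failover(replicas, request):
    """Try each replica in order and return the first successful answer."""
    for replica in replicas:
        try:
            return replica.handle(request)
        except InjectedFailure:
            continue  # this replica is down, try the next one
    raise RuntimeError("all replicas failed")
```

Passing a seeded `random.Random` as `rng` makes a failing run repeatable, which helps when a graduate needs to replay the exact sequence of failures that exposed a bug.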

3. Embrace the Asynchronous

Blocking calls will kill performance and may impact availability. The slowest system in a chain will cause a backlog on the other nodes. Asynchronous design allows you to avoid the cascade effects that can occur when one node in the system brings down other nodes. It is the best way to build fault-tolerant systems. So we start by building a system that relies on synchronous calls, show what happens under extreme load or when failures occur, and then evolve it to work asynchronously.
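One way to picture the move away from blocking calls is a bounded in-memory queue between a fast caller and a slow downstream node. This is a minimal sketch using only the Python standard library, not a description of how Logentries does it; `submit`, `worker` and `slow_downstream` are hypothetical names. The caller hands work off without waiting, and when the queue is full it is told so immediately instead of being stalled.

```python
import queue
import threading
import time


def slow_downstream(item, results):
    """Stand-in for a slow node further down the chain."""
    time.sleep(0.01)
    results.append(item)


def submit(inbox, item):
    """Non-blocking hand-off: return False when the queue is full."""
    try:
        inbox.put_nowait(item)
        return True
    except queue.Full:
        return False  # caller can shed load, retry later, or report busy


def worker(inbox, results):
    """Drain the queue until a None sentinel arrives."""
    while True:
        item = inbox.get()
        if item is None:
            break
        slow_downstream(item, results)
```

The bound on the queue is the important design choice: an unbounded queue only hides the backlog until memory runs out, while a bounded one turns overload into an explicit signal the caller can act on.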

4. Mechanical Sympathy

The LMAX Disruptor team understands that to get every last drop of performance out of your system you sometimes need to know every last detail of the hardware environment you are running on. They call it Mechanical Sympathy. They took what they knew about modern-day processor architectures (caches, memory barriers, compare-and-swap etc.) and turned it into a software solution for high-performance inter-thread messaging, which leads to massive performance improvements when used correctly. So next we show our graduates what happens when you move from queue-based interactions to using Disruptor.
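The data structure at the heart of this approach is a preallocated ring buffer addressed by ever-increasing sequence numbers. The sketch below shows only that shape, for a single producer and a single consumer on one thread. It is not the LMAX Disruptor API, and Python cannot demonstrate the cache and memory-barrier effects that make the real thing fast; `RingBuffer`, `try_publish` and `try_consume` are hypothetical names.

```python
class RingBuffer:
    """Fixed-size buffer whose slots are allocated once and then reused."""

    def __init__(self, size):
        # A power-of-two size lets a sequence number be mapped to a slot
        # with a bit mask instead of a modulo operation.
        if size <= 0 or size & (size - 1):
            raise ValueError("size must be a power of two")
        self._slots = [None] * size
        self._mask = size - 1
        self._next_write = 0  # sequence number of the next slot to publish
        self._next_read = 0   # sequence number of the next slot to consume

    def try_publish(self, value):
        """Store value and return True, or return False if the buffer is full."""
        if self._next_write - self._next_read == len(self._slots):
            return False
        self._slots[self._next_write & self._mask] = value
        self._next_write += 1
        return True

    def try_consume(self):
        """Return the oldest unread value, or None if the buffer is empty.

        Published values must therefore not be None in this sketch.
        """
        if self._next_read == self._next_write:
            return None
        value = self._slots[self._next_read & self._mask]
        self._next_read += 1
        return value
```

Because slots are reused rather than allocated per message, the buffer creates no garbage in steady state, which is one of the properties the Disruptor design exploits.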

5. Horizontal over Vertical Scaling

Vertical scaling can give you quick wins in performance, but when it comes to building a business like Logentries, where subscriber growth outpaces technology, the only way to design systems to scale with your success is to scale them horizontally. Don’t wait for technology to set limits on your growth. It’s also better to scale on commodity hardware or, in our case, on the EC2 servers that give you the best price-performance, rather than rely on the latest premium-priced bells and whistles. When it comes to the practical side of scaling, we focus on splitting out the load, but we plan for multiple axes of scale. Scaling out horizontally gives us the first split, where transactions are spread over nodes. But we also ensure we have options for partitioning by type of transaction (log queries, API calls, live tail requests etc.) and by customer.
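The axes of scale described above can be sketched as a routing function: first choose a pool of nodes by transaction type, then choose a node within that pool by customer. Everything here is an assumption made for illustration, including the pool layout in `POOLS`, the node names, and the helper names `stable_hash` and `route`; none of it describes the actual Logentries topology.

```python
import hashlib

# Hypothetical pools: one group of nodes per transaction type.
POOLS = {
    "log_query": ["query-0", "query-1"],
    "api_call": ["api-0", "api-1", "api-2"],
    "live_tail": ["tail-0"],
}


def stable_hash(key):
    """Hash a string to an integer that is the same on every run.

    Python's built-in hash() is randomized per process for strings,
    so it is unsuitable for routing decisions shared between nodes.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def route(transaction_type, customer_id, pools=POOLS):
    """Pick a node: partition by transaction type, then by customer."""
    nodes = pools[transaction_type]
    return nodes[stable_hash(customer_id) % len(nodes)]
```

Plain modulo hashing remaps most customers whenever a pool grows or shrinks; consistent hashing is a common alternative when that reshuffling is too expensive.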

6. Talent Borrows, Genius Steals

We work only ten minutes away from where Oscar Wilde was born (Merrion Square, Dublin), so we do like to keep his words in mind. There are so many great distributed technologies around today, and they are getting better all the time. We have been inspired by Kafka, Hadoop, etcd, BDAS etc. But we like to control our own destiny, and there are some very good reasons why we have built our platform from the ground up to do what it does in the best way it can. Turning logs into business insights isn’t easy, so we do like to stand on the shoulders of some distributed systems giants. So the final step in our training program is to evaluate other distributed systems in the context of what we have learnt in the previous steps. Learn how Kafka implements mechanical sympathy when maximizing disk I/O. Understand Raft in the context of etcd. Understand the challenges of scaling analytics in the context of Hadoop.

Over the next few months, as we put our training program into practice, we hope to have some of our graduates blog about their experiences. Stay tuned!

More Stories By Trevor Parsons

Trevor Parsons is Chief Scientist and Co-founder of Logentries. Trevor has over 10 years’ experience in enterprise software and, in particular, has specialized in developing enterprise monitoring and performance tools for distributed systems. He is also a research fellow at the Performance Engineering Lab Research Group and was formerly a Scientist at the IBM Center for Advanced Studies. Trevor holds a PhD from University College Dublin, Ireland.
