Welcome!

SOA & WOA Authors: Bernard Golden, Elizabeth White, Carmen Gonzalez, Ivan Antsipau, Paul Speciale

Related Topics: SOA & WOA

SOA & WOA: Article

In at the Deep End: Training for High Performance Distributed Systems

Six best practices for training new engineers

Here at Logentries we have a simple philosophy when it comes to hiring: hire the best people we can find and let them jump in at the deep end. That is how we like to learn. Smart people like to go deep and then find out what they don’t know as they work through some real world problems. And, our job is to give them the mentoring and support they need to overcome the blockers quickly and continue the learning process. We ensure they come to us with a great computer science or mathematical background and we take it from there.

Logentries TrainingWhen we thought about designing a training program we built it around the following set of problems:

1. CAP Theorem

Learn to accept that consistency, availability and partition tolerance cannot be provided simultaneously. Decide what is important for your customers and make the best trade-offs you can. But, what you build needs to be understandable. Raft provides a consensus algorithm whose aim is to be understandable, or at least to be easier to comprehend than Paxos or Paxos-influenced algorithms. It’s a good place to start to understand the distributed systems challenge. So we start with the theory then the next step we get practical…

2. Embrace Hardware Failures

Logentries runs is an Amazon EC2-based service – an environment where failures in parts of the system are routine. Netflix has taken this approach to its natural conclusion. If you expect failure and know that when it happens it’s going to happen at the worst possible time; why not plan for it to happen all time. So they built Chaos Monkey, a tool that randomly generates various kinds of failures in your environment. You can run this to schedule so that failures happen when you are closely monitoring things. By combining Raft + Chaos Monkey – you get to understand what happens when things go wrong and how to build fault tolerant systems that can deal with any failure.

3. Embrace the Asynchronous

Blocking calls will kill performance and may impact availability. The slowest system in a chain will cause a backlog on the other nodes. Asynchronous design allows you to avoid the cascade effects that can occur when one node in the system brings down other nodes. It is the best way to build fault tolerant systems. So we start by building a system that relies on synchronous calls, show what happens under extreme load or when failures occur and then evolve this to work asynchronously.

4. Mechanical Sympathy

The LMAX Disruptor team understands that to get every last drop of performance out of your system you sometimes need to know every last detail of the hardware environment you are running on. They call it Mechanical Sympathy. They took what they knew about modern day processor architectures (caches, memory barriers, compare and swap etc.) and turned it into a software solution for high performance inter-thread messaging, which leads to massive performance improvements when used correctly.  So next we show our graduates what happens when you move from queue-based interactions to using Disruptor.

5. Horizontal over Vertical Scaling

Vertical scaling can give you quick wins in performance but when it comes to building a business like Logentries, where subscriber growth outpaces technology, the only way to design systems to scale with your success is to scale them horizontally. Don’t wait for technology to set limits on your growth. Also it’s better to scale on commodity hardware or, in our case, to scale on the EC2 servers that give you the best price-performance rather than rely on the latest premium-priced bells and whistles. When it comes to practical side of scaling we focus on splitting out the load but we plan for multiple axes of scale. Scaling out horizontally gives us the first split, where transactions are split over nodes. But we also ensure we have options for scaling by partitioning by types of transactions (log queries, API calls, live tail requests etc.), and by customer etc.

6. Talent Borrows, Genius Steals

We work only ten minutes away from where Oscar Wilde was born (Merrion Square, Dublin), so we do like to keep his words in mind. There are so many great distributed technologies around today – and they are getting better all the time. We have been inspired by Kafka, Hadoop, etcd, BDAS etc. But we like to control our own destiny and there are some very good reasons why we have built our platform from the ground up to do what it does in the best way it can. Turning logs into business insights isn’t easy – so we do like to stand on the shoulders of some distributed systems giants. So the final steps in our training program is to evaluate other distributed systems in the context of what we have learnt in the previous steps. Learn how Kafka implements mechanical sympathy when maximizing disk I/O. Understand Raft in the context of etcd. Understand the challenges on scaling analytics in the context of Hadoop.

Over the next few months – as we put our training program into practice we hope to have some of graduates blog about their experiences. Stay tuned!

More Stories By Trevor Parsons

Trevor Parsons is Chief Scientist and Co-founder of Logentries. Trevor has over 10 years experience in enterprise software and, in particular, has specialized in developing enterprise monitoring and performance tools for distributed systems. He is also a research fellow at the Performance Engineering Lab Research Group and was formerly a Scientist at the IBM Center for Advanced Studies. Trevor holds a PhD from University College Dublin, Ireland.

@ThingsExpo Stories
Explosive growth in connected devices. Enormous amounts of data for collection and analysis. Critical use of data for split-second decision making and actionable information. All three are factors in making the Internet of Things a reality. Yet, any one factor would have an IT organization pondering its infrastructure strategy. How should your organization enhance its IT framework to enable an Internet of Things implementation? In his session at Internet of @ThingsExpo, James Kirkland, Chief Architect for the Internet of Things and Intelligent Systems at Red Hat, will describe how to revoluti...
The Internet of Things will greatly expand the opportunities for data collection and new business models driven off of that data. In her session at Internet of @ThingsExpo, Esmeralda Swartz, CMO of MetraTech, will discuss how for this to be effective you not only need to have infrastructure and operational models capable of utilizing this new phenomenon, but increasingly service providers will need to convince a skeptical public to participate. Get ready to show them the money! Speaker Bio: Esmeralda Swartz, CMO of MetraTech, has spent 16 years as a marketing, product management, and busin...
Samsung VP Jacopo Lenzi, who headed the company's recent SmartThings acquisition under the auspices of Samsung's Open Innovaction Center (OIC), answered a few questions we had about the deal. This interview was in conjunction with our interview with SmartThings CEO Alex Hawkinson. IoT Journal: SmartThings was developed in an open, standards-agnostic platform, and will now be part of Samsung's Open Innovation Center. Can you elaborate on your commitment to keep the platform open? Jacopo Lenzi: Samsung recognizes that true, accelerated innovation cannot be driven from one source, but requires a...
SYS-CON Events announced today that Red Hat, the world's leading provider of open source solutions, will exhibit at Internet of @ThingsExpo, which will take place on November 4–6, 2014, at the Santa Clara Convention Center in Santa Clara, CA. Red Hat is the world's leading provider of open source software solutions, using a community-powered approach to reliable and high-performing cloud, Linux, middleware, storage and virtualization technologies. Red Hat also offers award-winning support, training, and consulting services. As the connective hub in a global network of enterprises, partners, a...
P2P RTC will impact the landscape of communications, shifting from traditional telephony style communications models to OTT (Over-The-Top) cloud assisted & PaaS (Platform as a Service) communication services. The P2P shift will impact many areas of our lives, from mobile communication, human interactive web services, RTC and telephony infrastructure, user federation, security and privacy implications, business costs, and scalability. In his session at Internet of @ThingsExpo, Robin Raymond, Chief Architect at Hookflash Inc., will walk through the shifting landscape of traditional telephone a...
SYS-CON Events announced today that Matrix.org has been named “Silver Sponsor” of Internet of @ThingsExpo, which will take place on November 4–6, 2014, at the Santa Clara Convention Center in Santa Clara, CA. Matrix is an ambitious new open standard for open, distributed, real-time communication over IP. It defines a new approach for interoperable Instant Messaging and VoIP based on pragmatic HTTP APIs and WebRTC, and provides open source reference implementations to showcase and bootstrap the new standard. Our focus is on simplicity, security, and supporting the fullest feature set.
BSQUARE is a global leader of embedded software solutions. We enable smart connected systems at the device level and beyond that millions use every day and provide actionable data solutions for the growing Internet of Things (IoT) market. We empower our world-class customers with our products, services and solutions to achieve innovation and success. For more information, visit www.bsquare.com.
How do APIs and IoT relate? The answer is not as simple as merely adding an API on top of a dumb device, but rather about understanding the architectural patterns for implementing an IoT fabric. There are typically two or three trends: Exposing the device to a management framework Exposing that management framework to a business centric logic • Exposing that business layer and data to end users. This last trend is the IoT stack, which involves a new shift in the separation of what stuff happens, where data lives and where the interface lies. For instance, it’s a mix of architectural style...
SYS-CON Events announced today that SOA Software, an API management leader, will exhibit at SYS-CON's 15th International Cloud Expo®, which will take place on November 4–6, 2014, at the Santa Clara Convention Center in Santa Clara, CA. SOA Software is a leading provider of API Management and SOA Governance products that equip business to deliver APIs and SOA together to drive their company to meet its business strategy quickly and effectively. SOA Software’s technology helps businesses to accelerate their digital channels with APIs, drive partner adoption, monetize their assets, and achieve a...
From a software development perspective IoT is about programming "things," about connecting them with each other or integrating them with existing applications. In his session at @ThingsExpo, Yakov Fain, co-founder of Farata Systems and SuranceBay, will show you how small IoT-enabled devices from multiple manufacturers can be integrated into the workflow of an enterprise application. This is a practical demo of building a framework and components in HTML/Java/Mobile technologies to serve as a platform that can integrate new devices as they become available on the market.
SYS-CON Events announced today that Utimaco will exhibit at SYS-CON's 15th International Cloud Expo®, which will take place on November 4–6, 2014, at the Santa Clara Convention Center in Santa Clara, CA. Utimaco is a leading manufacturer of hardware based security solutions that provide the root of trust to keep cryptographic keys safe, secure critical digital infrastructures and protect high value data assets. Only Utimaco delivers a general-purpose hardware security module (HSM) as a customizable platform to easily integrate into existing software solutions, embed business logic and build s...
Connected devices are changing the way we go about our everyday life, from wearables to driverless cars, to smart grids and entire industries revolutionizing business opportunities through smart objects, capable of two-way communication. But what happens when objects are given an IP-address, and we rely on that connection, sometimes with our lives? How do we secure those vast data infrastructures and safe-keep the privacy of sensitive information? This session will outline how each and every connected device can uphold a core root of trust via a unique cryptographic signature – a “bir...
Internet of @ThingsExpo Silicon Valley announced on Thursday its first 12 all-star speakers and sessions for its upcoming event, which will take place November 4-6, 2014, at the Santa Clara Convention Center in California. @ThingsExpo, the first and largest IoT event in the world, debuted at the Javits Center in New York City in June 10-12, 2014 with over 6,000 delegates attending the conference. Among the first 12 announced world class speakers, IBM will present two highly popular IoT sessions, which will take place November 4-6, 2014 at the Santa Clara Convention Center in Santa Clara, Calif...
Almost everyone sees the potential of Internet of Things but how can businesses truly unlock that potential. The key will be in the ability to discover business insight in the midst of an ocean of Big Data generated from billions of embedded devices via Systems of Discover. Businesses will also need to ensure that they can sustain that insight by leveraging the cloud for global reach, scale and elasticity.
WebRTC defines no default signaling protocol, causing fragmentation between WebRTC silos. SIP and XMPP provide possibilities, but come with considerable complexity and are not designed for use in a web environment. In his session at Internet of @ThingsExpo, Matthew Hodgson, technical co-founder of the Matrix.org, will discuss how Matrix is a new non-profit Open Source Project that defines both a new HTTP-based standard for VoIP & IM signaling and provides reference implementations.

SUNNYVALE, Calif., Oct. 20, 2014 /PRNewswire/ -- Spansion Inc. (NYSE: CODE), a global leader in embedded systems, today added 96 new products to the Spansion® FM4 Family of flexible microcontrollers (MCUs). Based on the ARM® Cortex®-M4F core, the new MCUs boast a 200 MHz operating frequency and support a diverse set of on-chip peripherals for enhanced human machine interfaces (HMIs) and machine-to-machine (M2M) communications. The rich set of periphera...

SYS-CON Events announced today that Aria Systems, the recurring revenue expert, has been named "Bronze Sponsor" of SYS-CON's 15th International Cloud Expo®, which will take place on November 4-6, 2014, at the Santa Clara Convention Center in Santa Clara, CA. Aria Systems helps leading businesses connect their customers with the products and services they love. Industry leaders like Pitney Bowes, Experian, AAA NCNU, VMware, HootSuite and many others choose Aria to power their recurring revenue business and deliver exceptional experiences to their customers.
The Internet of Things (IoT) is going to require a new way of thinking and of developing software for speed, security and innovation. This requires IT leaders to balance business as usual while anticipating for the next market and technology trends. Cloud provides the right IT asset portfolio to help today’s IT leaders manage the old and prepare for the new. Today the cloud conversation is evolving from private and public to hybrid. This session will provide use cases and insights to reinforce the value of the network in helping organizations to maximize their company’s cloud experience.
The Internet of Things (IoT) is making everything it touches smarter – smart devices, smart cars and smart cities. And lucky us, we’re just beginning to reap the benefits as we work toward a networked society. However, this technology-driven innovation is impacting more than just individuals. The IoT has an environmental impact as well, which brings us to the theme of this month’s #IoTuesday Twitter chat. The ability to remove inefficiencies through connected objects is driving change throughout every sector, including waste management. BigBelly Solar, located just outside of Boston, is trans...
SYS-CON Events announced today that Matrix.org has been named “Silver Sponsor” of Internet of @ThingsExpo, which will take place on November 4–6, 2014, at the Santa Clara Convention Center in Santa Clara, CA. Matrix is an ambitious new open standard for open, distributed, real-time communication over IP. It defines a new approach for interoperable Instant Messaging and VoIP based on pragmatic HTTP APIs and WebRTC, and provides open source reference implementations to showcase and bootstrap the new standard. Our focus is on simplicity, security, and supporting the fullest feature set.