|By Trevor Parsons||
|April 18, 2014 10:00 AM EDT||
Here at Logentries we have a simple philosophy when it comes to hiring: hire the best people we can find and let them jump in at the deep end. That is how we like to learn. Smart people like to go deep and then find out what they don’t know as they work through some real world problems. And, our job is to give them the mentoring and support they need to overcome the blockers quickly and continue the learning process. We ensure they come to us with a great computer science or mathematical background and we take it from there.
Logentries TrainingWhen we thought about designing a training program we built it around the following set of problems:
1. CAP Theorem
Learn to accept that consistency, availability and partition tolerance cannot be provided simultaneously. Decide what is important for your customers and make the best trade-offs you can. But, what you build needs to be understandable. Raft provides a consensus algorithm whose aim is to be understandable, or at least to be easier to comprehend than Paxos or Paxos-influenced algorithms. It’s a good place to start to understand the distributed systems challenge. So we start with the theory then the next step we get practical…
2. Embrace Hardware Failures
Logentries runs is an Amazon EC2-based service – an environment where failures in parts of the system are routine. Netflix has taken this approach to its natural conclusion. If you expect failure and know that when it happens it’s going to happen at the worst possible time; why not plan for it to happen all time. So they built Chaos Monkey, a tool that randomly generates various kinds of failures in your environment. You can run this to schedule so that failures happen when you are closely monitoring things. By combining Raft + Chaos Monkey – you get to understand what happens when things go wrong and how to build fault tolerant systems that can deal with any failure.
3. Embrace the Asynchronous
Blocking calls will kill performance and may impact availability. The slowest system in a chain will cause a backlog on the other nodes. Asynchronous design allows you to avoid the cascade effects that can occur when one node in the system brings down other nodes. It is the best way to build fault tolerant systems. So we start by building a system that relies on synchronous calls, show what happens under extreme load or when failures occur and then evolve this to work asynchronously.
4. Mechanical Sympathy
The LMAX Disruptor team understands that to get every last drop of performance out of your system you sometimes need to know every last detail of the hardware environment you are running on. They call it Mechanical Sympathy. They took what they knew about modern day processor architectures (caches, memory barriers, compare and swap etc.) and turned it into a software solution for high performance inter-thread messaging, which leads to massive performance improvements when used correctly. So next we show our graduates what happens when you move from queue-based interactions to using Disruptor.
5. Horizontal over Vertical Scaling
Vertical scaling can give you quick wins in performance but when it comes to building a business like Logentries, where subscriber growth outpaces technology, the only way to design systems to scale with your success is to scale them horizontally. Don’t wait for technology to set limits on your growth. Also it’s better to scale on commodity hardware or, in our case, to scale on the EC2 servers that give you the best price-performance rather than rely on the latest premium-priced bells and whistles. When it comes to practical side of scaling we focus on splitting out the load but we plan for multiple axes of scale. Scaling out horizontally gives us the first split, where transactions are split over nodes. But we also ensure we have options for scaling by partitioning by types of transactions (log queries, API calls, live tail requests etc.), and by customer etc.
6. Talent Borrows, Genius Steals
We work only ten minutes away from where Oscar Wilde was born (Merrion Square, Dublin), so we do like to keep his words in mind. There are so many great distributed technologies around today – and they are getting better all the time. We have been inspired by Kafka, Hadoop, etcd, BDAS etc. But we like to control our own destiny and there are some very good reasons why we have built our platform from the ground up to do what it does in the best way it can. Turning logs into business insights isn’t easy – so we do like to stand on the shoulders of some distributed systems giants. So the final steps in our training program is to evaluate other distributed systems in the context of what we have learnt in the previous steps. Learn how Kafka implements mechanical sympathy when maximizing disk I/O. Understand Raft in the context of etcd. Understand the challenges on scaling analytics in the context of Hadoop.
Over the next few months – as we put our training program into practice we hope to have some of graduates blog about their experiences. Stay tuned!
SYS-CON Events announced today that Numerex Corp, a leading provider of managed enterprise solutions enabling the Internet of Things (IoT), will exhibit at the 19th International Cloud Expo | @ThingsExpo, which will take place on November 1–3, 2016, at the Santa Clara Convention Center in Santa Clara, CA. Numerex Corp. (NASDAQ:NMRX) is a leading provider of managed enterprise solutions enabling the Internet of Things (IoT). The Company's solutions produce new revenue streams or create operating...
Sep. 26, 2016 01:00 AM EDT Reads: 1,949
Analysis of 25,000 applications reveals 6.8% of packages/components used included known defects. Organizations standardizing on components between 2 - 3 years of age can decrease defect rates substantially. Open source and third-party packages/components live at the heart of high velocity software development organizations. Today, an average of 106 packages/components comprise 80 - 90% of a modern application, yet few organizations have visibility into what components are used where.
Sep. 25, 2016 11:45 PM EDT Reads: 503
DevOps at Cloud Expo – being held November 1-3, 2016, at the Santa Clara Convention Center in Santa Clara, CA – announces that its Call for Papers is open. Born out of proven success in agile development, cloud computing, and process automation, DevOps is a macro trend you cannot afford to miss. From showcase success stories from early adopters and web-scale businesses, DevOps is expanding to organizations of all sizes, including the world's largest enterprises – and delivering real results. Am...
Sep. 25, 2016 10:45 PM EDT Reads: 4,408
If you’re responsible for an application that depends on the data or functionality of various IoT endpoints – either sensors or devices – your brand reputation depends on the security, reliability, and compliance of its many integrated parts. If your application fails to deliver the expected business results, your customers and partners won't care if that failure stems from the code you developed or from a component that you integrated. What can you do to ensure that the endpoints work as expect...
Sep. 25, 2016 10:45 PM EDT Reads: 1,581
SYS-CON Events announced today that Commvault, a global leader in enterprise data protection and information management, has been named “Bronze Sponsor” of SYS-CON's 19th International Cloud Expo, which will take place on November 1–3, 2016, at the Santa Clara Convention Center in Santa Clara, CA. Commvault is a leading provider of data protection and information management solutions, helping companies worldwide activate their data to drive more value and business insight and to transform moder...
Sep. 25, 2016 09:30 PM EDT Reads: 2,494
Throughout history, various leaders have risen up and tried to unify the world by conquest. Fortunately, none of their plans have succeeded. The world goes on just fine with each country ruling itself; no single ruler is necessary. That’s how it is with the container platform ecosystem, as well. There’s no need for one all-powerful, all-encompassing container platform. Think about any other technology sector out there – there are always multiple solutions in every space. The same goes for conta...
Sep. 25, 2016 09:00 PM EDT Reads: 955
Let's recap what we learned from the previous chapters in the series: episode 1 and episode 2. We learned that a good rollback mechanism cannot be designed without having an intimate knowledge of the application architecture, the nature of your components and their dependencies. Now that we know what we have to restore and in which order, the question is how?
Sep. 25, 2016 07:00 PM EDT Reads: 1,202
About a year ago we tuned into “the need for speed” and how a concept like "serverless computing” was increasingly catering to this. We are now a year further and the term “serverless” is taking on unexpected proportions. With some even seeing it as the successor to cloud in general or at least as a successor to the clouds’ poorer cousin in terms of revenue, hype and adoption: PaaS. The question we need to ask is whether this constitutes an example of Hype Hopping: to effortlessly pivot to the ...
Sep. 25, 2016 05:45 PM EDT Reads: 851
SYS-CON Events announced today the Kubernetes and Google Container Engine Workshop, being held November 3, 2016, in conjunction with @DevOpsSummit at 19th Cloud Expo at the Santa Clara Convention Center in Santa Clara, CA. This workshop led by Sebastian Scheele introduces participants to Kubernetes and Google Container Engine (GKE). Through a combination of instructor-led presentations, demonstrations, and hands-on labs, students learn the key concepts and practices for deploying and maintainin...
Sep. 25, 2016 02:00 PM EDT Reads: 2,627
Enterprise IT has been in the era of Hybrid Cloud for some time now. But it seems most conversations about Hybrid are focused on integrating AWS, Microsoft Azure, or Google ECM into existing on-premises systems. Where is all the Private Cloud? What do technology providers need to do to make their offerings more compelling? How should enterprise IT executives and buyers define their focus, needs, and roadmap, and communicate that clearly to the providers?
Sep. 25, 2016 02:00 PM EDT Reads: 1,524
There is little doubt that Big Data solutions will have an increasing role in the Enterprise IT mainstream over time. Big Data at Cloud Expo - to be held November 1-3, 2016, at the Santa Clara Convention Center in Santa Clara, CA - has announced its Call for Papers is open. Cloud computing is being adopted in one form or another by 94% of enterprises today. Tens of billions of new devices are being connected to The Internet of Things. And Big Data is driving this bus. An exponential increase is...
Sep. 25, 2016 12:45 PM EDT Reads: 2,482
DevOps at Cloud Expo, taking place Nov 1-3, 2016, at the Santa Clara Convention Center in Santa Clara, CA, is co-located with 19th Cloud Expo and will feature technical sessions from a rock star conference faculty and the leading industry players in the world. The widespread success of cloud computing is driving the DevOps revolution in enterprise IT. Now as never before, development teams must communicate and collaborate in a dynamic, 24/7/365 environment. There is no time to wait for long dev...
Sep. 25, 2016 12:15 PM EDT Reads: 3,395
The many IoT deployments around the world are busy integrating smart devices and sensors into their enterprise IT infrastructures. Yet all of this technology – and there are an amazing number of choices – is of no use without the software to gather, communicate, and analyze the new data flows. Without software, there is no IT. In this power panel at @ThingsExpo, moderated by Conference Chair Roger Strukhoff, panelists will look at the protocols that communicate data and the emerging data analy...
Sep. 25, 2016 11:00 AM EDT Reads: 1,580
All clouds are not equal. To succeed in a DevOps context, organizations should plan to develop/deploy apps across a choice of on-premise and public clouds simultaneously depending on the business needs. This is where the concept of the Lean Cloud comes in - resting on the idea that you often need to relocate your app modules over their life cycles for both innovation and operational efficiency in the cloud. In his session at @DevOpsSummit at19th Cloud Expo, Valentin (Val) Bercovici, CTO of So...
Sep. 25, 2016 10:00 AM EDT Reads: 1,359
Video experiences should be unique and exciting! But that doesn’t mean you need to patch all the pieces yourself. Users demand rich and engaging experiences and new ways to connect with you. But creating robust video applications at scale can be complicated, time-consuming and expensive. In his session at @ThingsExpo, Zohar Babin, Vice President of Platform, Ecosystem and Community at Kaltura, will discuss how VPaaS enables you to move fast, creating scalable video experiences that reach your...
Sep. 25, 2016 10:00 AM EDT Reads: 958
SYS-CON Events announced today the Enterprise IoT Bootcamp, being held November 1-2, 2016, in conjunction with 19th Cloud Expo | @ThingsExpo at the Santa Clara Convention Center in Santa Clara, CA. Combined with real-world scenarios and use cases, the Enterprise IoT Bootcamp is not just based on presentations but with hands-on demos and detailed walkthroughs. We will introduce you to a variety of real world use cases prototyped using Arduino, Raspberry Pi, BeagleBone, Spark, and Intel Edison. Y...
Sep. 25, 2016 06:30 AM EDT Reads: 2,839
With the rise of Docker, Kubernetes, and other container technologies, the growth of microservices has skyrocketed among dev teams looking to innovate on a faster release cycle. This has enabled teams to finally realize their DevOps goals to ship and iterate quickly in a continuous delivery model. Why containers are growing in popularity is no surprise — they’re extremely easy to spin up or down, but come with an unforeseen issue. However, without the right foresight, DevOps and IT teams may lo...
Sep. 25, 2016 04:30 AM EDT Reads: 927
As applications are promoted from the development environment to the CI or the QA environment and then into the production environment, it is very common for the configuration settings to be changed as the code is promoted. For example, the settings for the database connection pools are typically lower in development environment than the QA/Load Testing environment. The primary reason for the existence of the configuration setting differences is to enhance application performance. However, occas...
Sep. 25, 2016 03:45 AM EDT Reads: 716
When scaling agile / Scrum, we invariable run into the alignment vs autonomy problem. In short: you cannot have autonomous self directing teams if they have no clue in what direction they should go, or even shorter: Alignment breeds autonomy. But how do we create alignment? and what tools can we use to quickly evaluate if what we want to do is part of the mission or better left out? Niel Nickolaisen created the Purpose Alignment model and I use it with innovation labs in large enterprises to de...
Sep. 25, 2016 01:15 AM EDT Reads: 1,242
Information technology is an industry that has always experienced change, and the dramatic change sweeping across the industry today could not be truthfully described as the first time we've seen such widespread change impacting customer investments. However, the rate of the change, and the potential outcomes from today's digital transformation has the distinct potential to separate the industry into two camps: Organizations that see the change coming, embrace it, and successful leverage it; and...
Sep. 25, 2016 12:45 AM EDT Reads: 1,069