Intuit’s DevSecOps

An interview with the DevSecOps team at Intuit that covers what processes they use to get company buy-in on DevOps adoption

Wow, if you ever wanted to learn about Rugged DevOps (some call it DevSecOps), sit down for a spell with Shannon Lietz, Ian Allison and Scott Kennedy from Intuit. We discussed a number of important topics including internal war games, culture hacking, gamification of Rugged DevOps and starting as a small team. There are 100 gold nuggets in this conversation for novices and experts alike.

Derek: I have some of the Intuit DevSecOps team here with me today. We're going to talk to them a little bit about Rugged DevOps and how things work over at Intuit. Let's start with some introductions.

Ian: I'm Ian Allison. I help run the Red Team at Intuit, which is, I guess you'd say, an interesting way of taking control of security at our company. We try to get ahead of the attackers by basically being the attackers. We're essentially ethical hackers. We go after all of our own stuff to make sure we can find where the deficiencies lie in all of our software.

Shannon: I'm Shannon Lietz. I've been working at Intuit three-and-a-half years and helped to found the 24x7 DevSecOps capability at Intuit, leading the Red Team, our security operations capability, our cyber SOC, and what we also consider "blue teaming": being able to hunt for defects.

The organization has really had to transform how we do software development because we're a 30-year-old software company. We are now seeing the traditional way of putting together software really embracing DevOps. For us, it's been exciting to really work in the industry with Rugged DevOps, trying to help build security into the DevOps movement.

Scott: I'm Scott Kennedy. I run the forensics and threat intelligence part of cyber work.

Derek: Shannon (@devsecops), tell me a little bit about software supply chains and how that vision of software development has impacted the way you see things at Intuit.

Shannon: That's really a great question. It was interesting when Josh Corman and I first talked; we had a lot in common. One of those things was the software supply chain. What I really love about the concept is being able to have processes driven a certain way so that you can reduce defects.

Having worked for Toyota in the past and understanding the supply chain mentality, you get a sense of how you could put something together better, incrementing on it, figuring out how to share that process, and then really figuring out what things are important. Having that notion of fewer, better suppliers was really a core concept.

I love the idea of transparency, building things a certain way, and really getting into continuous improvement. You need to look at things from an opportunities perspective - making sure you're not just looking to make things perfect. You're looking for those opportunities to improve over time.

Derek: As we think about Rugged DevOps within your security team, how do you measure the success of what you're doing? What kind of metrics are you looking at that matter to the business?

Shannon: We measure everything. For example, mean time to remediation (MTTR). Once somebody finds a defect, we analyze that defect from the time it got into the supply chain to when it actually gets resolved. We track everything from mean time to remediation to when the ticket was created, to being able to look at when the code actually got published, to when it actually got found, and then we work on those things over time. We really try to uplift.
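As a rough illustration of the metric Shannon describes, mean time to remediation can be computed from defect timestamps. The sketch below is not Intuit's tooling: the record layout, the ticket IDs and the dates are all hypothetical, and it measures only the interval from when a defect was found to when it was resolved.

```python
from datetime import datetime, timedelta
from statistics import mean

# Hypothetical defect records; field names and values are illustrative only.
defects = [
    {"id": "SEC-1", "found": datetime(2016, 3, 1, 9, 0), "resolved": datetime(2016, 3, 1, 13, 0)},
    {"id": "SEC-2", "found": datetime(2016, 3, 2, 10, 0), "resolved": datetime(2016, 3, 2, 12, 0)},
    {"id": "SEC-3", "found": datetime(2016, 3, 3, 8, 0), "resolved": datetime(2016, 3, 3, 11, 0)},
]


def mean_time_to_remediation(records):
    """Average the found-to-resolved interval across all records.

    Raises statistics.StatisticsError if records is empty.
    """
    durations = [(r["resolved"] - r["found"]).total_seconds() for r in records]
    return timedelta(seconds=mean(durations))


mttr = mean_time_to_remediation(defects)
print(mttr)
```

The same function could be pointed at other pairs of timestamps, such as when the code was published and when the defect was found, to track the other intervals mentioned above.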

We leverage JIRA just like a software development team does. We register our defects and figure out how to get development teams to take responsibility for those ideas. It goes through their process of release and regression testing. As part of that, we look back to see where our opportunities are.

As an example, we started out where things may have taken weeks. We then reduced it down to days and ultimately got it down to hours. We've seen defect resolution where it's now minutes. When it's something we've discovered that was just a mistake by an engineer, we realize "mistakes do happen." We found that our cycle times also help us to find full-stack vulnerabilities in real time because we get to do end-to-end testing more aggressively utilizing this method.

Derek: How has consistency in your operations helped with Rugged DevOps, and has it reduced fragility within the organization?

Ian: One of the things we do is utilize a golden image for all of the AMIs (Amazon Machine Images) we use, for all of our customers, and we require everybody to use these AMIs. We've also built some really interesting automation around scanning these AMIs. One thing we realized quickly when we first started in AWS is that when we try to do a full vulnerability scan against a system that is set up to autoscale, we all of a sudden have 50 systems. It's really hard to do a full vulnerability scan against a system like that, so we came up with a way to share back all of the AMIs with a special account. Then we bring those up and we scan them. Then we grade them.

Based upon the vulnerabilities that are found, you'll get a letter grade, like A through F, based upon the system you have. While we always strive to have our base image be an A, people continue to run on older images. But they get graded, and those grades get pushed up, so everybody in their org structure gets to see what the grade is for their account. I think being a little standardized with these images lets us know what's in everything, and we have a grade for everyone. It gives everyone a really good idea of where they stand from a security standpoint.
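The interview does not say how scan results are turned into a letter grade. A minimal sketch of one possible mapping, with severity labels and thresholds that are purely assumptions made for illustration:

```python
def grade_image(findings):
    """Map a list of finding severities to a letter grade.

    The severity names and cut-offs here are hypothetical, not Intuit's rules.
    """
    critical = findings.count("critical")
    high = findings.count("high")
    medium = findings.count("medium")

    if critical > 0:
        return "F"  # any critical finding fails the image outright
    if high > 2:
        return "D"
    if high > 0:
        return "C"
    if medium > 3:
        return "B"
    return "A"


print(grade_image([]))                    # clean baseline image
print(grade_image(["high", "medium"]))    # one newly disclosed high-severity issue
```

Under rules like these, a single newly published vulnerability is enough to move a previously clean image down by two grades, which matches the kind of overnight change Ian describes later in the conversation.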

Derek: That's not only a grading but a policy enforcement governance kind of role that grading plays. How rapid is the feedback loop in that grading system for the teams that you're working with?

Shannon: It's really quick, and we've discovered through some science that having component-based resources like AMIs provides us with an advantage when doing things like remediating vulnerabilities. Using AMI-based resources, we have seen that when there's a defect in it, we can find and remediate all of the defective AMIs quickly. That improves everyone's security across the company.

So if you're just picking out really good components, keeping track of those components and adding security into them, then you'll actually see a different effect across our pipeline. A single change can actually have a dramatic effect on reducing the problems within the pipeline.

Ian: It's really interesting. This morning I got an email from somebody that said, "Why did our baseline AMI go from an A to a C today?"

We had just received notice of a new vulnerability. Our stuff caught it, we scanned it, and we pushed the grade out to our portal where all our customers go to look at the grades. Our customers were able to see that change quickly.

They could now say, "Wow, it changed from an A to a C in less than 12 hours." I think the feedback is really important. The other important thing is that we have people going and looking. I wouldn't be getting emails about why has this changed if people aren't actually looking and wanting to make their grades better.

Derek: You mentioned customers. Are these internal customers?

Ian: Internal.

Shannon: Yeah, for our development teams, we as a security team really have changed how we think about things. It used to be that the security team would go out and govern. Basically, you got the fear of the security team coming in, descending upon you.

We've really changed how that happens within our organization. We grade our resource components, and we grade the way in which our applications come together. That changes how developers want to operate because they really want to figure out how to get better grades in security. And it creates a learning dynamic that incentivizes somebody to improve continuously.

Derek: Does it create a competitiveness or gamification of who has better grades?

Shannon: Absolutely, which is why we did it in the first place. To your point there, gamification is something where when you start to grade components like that, you can actually start to leverage a leaderboard concept. We do have leaderboards as part of this. We have APIs where you can actually pull down your grades and include them in your automation. With these, you can make governance decisions.
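To make the idea of pulling grades into automation concrete, here is a small sketch of a leaderboard and a governance check. It does not call any real Intuit API: the grades dictionary stands in for whatever such an API would return, the account names are invented, and the A, B, C, D, F scale is an assumption.

```python
GRADE_ORDER = "ABCDF"  # best to worst; assumed scale


def meets_minimum(grade, minimum):
    """Return True if grade is at least as good as the required minimum."""
    return GRADE_ORDER.index(grade) <= GRADE_ORDER.index(minimum)


def failing_accounts(grades, minimum="B"):
    """List the accounts whose grade falls below the required minimum."""
    return sorted(name for name, g in grades.items() if not meets_minimum(g, minimum))


def leaderboard(grades):
    """Rank accounts from best grade to worst, breaking ties by name."""
    return sorted(grades.items(), key=lambda item: (GRADE_ORDER.index(item[1]), item[0]))


# Hypothetical grades, as they might be returned by a grading API.
grades = {"payments": "A", "search": "C", "mobile": "B"}

print(leaderboard(grades))
print(failing_accounts(grades, minimum="B"))
```

A pipeline step could fail the build whenever `failing_accounts` returns a non-empty list, which is one way a leader could "ask for specific grades" within a pipeline.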

If you sort of have that "game afoot," right, your leaders can then ask for specific grades within their pipeline. That up-levels the system, and you just see a continuous improvement lifecycle come to bear. Ultimately, you see fewer defects, and ultimately, you get to the notion of Six Sigma in our way of thinking. DevOps is really about continuous improvement and embracing automation. Embracing that concept allows us to get to fewer defects faster.

Derek: As you embraced continuous practices and DevOps practices, were there points when you realized certain old ways of doing things weren't going to enable you to move forward?

Scott: In looking at the progression of what we've been doing, one of the decisions that was made in Intuit and one of the things that I saw was really unique was the way they decided we were going to migrate into AWS. Our idea was to have the chaos team be the first people out, and that's the security team. So the security team was the one that was going out and finding out how to use each of the products that AWS has and creating the concept of whitelisting. Each product was rated as to whether or not it met security's requirement.

Therefore, no team can go ahead and pull down this new cool tool that AWS released yesterday and use it in production because it's not been "whitelisted." That can go into their scoring. Their scoring is not only used by the development teams but also is useful when reporting to the board. When the board asks, "How are we doing as a company across the entire organization?" we can say that product A got a lower score than product B, and then they turn to the VP in charge of it and say, "Well, why?"

We decided to not rush into the cloud but to take a careful, considered approach and migrate in a very intelligent and well-thought-out way. At the same time, we gave the chaos team the ability to make the mistakes and grow and learn, so they can immediately turn around and share the mistakes with everyone else. They could say, "Hey, these are the things that didn't work for us. We came across a lot of problems, especially when you look at things like accounts and account roles."

How do you control when you have thousands of accounts and you need to have some sort of administrative control?

You can either have a gigantic effort to force your namespace and your Active Directory to be the source of control. Or you can use the vendor-specific tools like IAM and have each account have their own separate islands, but with the concept of cross-account roles, you can then do remote administration from a centralized account. You have it locked down. You have the capability to have a restricted group and be able to remotely go in and do the necessary actions.

That also gives you an audit trail. That also gives you multifactor built in because the AWS products get those things added to them.
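The cross-account pattern Scott outlines relies on each account exposing a role that a central administrative account is trusted to assume. The sketch below builds an IAM trust policy document of that shape. The account ID is a placeholder, and requiring MFA through the `aws:MultiFactorAuthPresent` condition key is an assumption about how the multifactor requirement might be expressed, not a description of Intuit's setup.

```python
import json


def cross_account_trust_policy(admin_account_id, require_mfa=True):
    """Build a trust policy letting a central account assume a role in this account."""
    statement = {
        "Effect": "Allow",
        "Principal": {"AWS": f"arn:aws:iam::{admin_account_id}:root"},
        "Action": "sts:AssumeRole",
    }
    if require_mfa:
        # Only allow the role to be assumed by sessions authenticated with MFA.
        statement["Condition"] = {"Bool": {"aws:MultiFactorAuthPresent": "true"}}
    return {"Version": "2012-10-17", "Statement": [statement]}


# 111122223333 is a placeholder account ID, not a real administrative account.
policy = cross_account_trust_policy("111122223333")
print(json.dumps(policy, indent=2))
```

Attaching a policy like this to a role in every member account keeps each account an isolated island while still allowing a restricted group in the central account to administer it remotely.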

Shannon: When it comes down to it, I think culture-hacking your environment can have a profound effect, especially when you're going through a DevOps transformation.

Derek: What is culture hacking?

Shannon: That's a great question. We use it when really trying to figure out how we as a security team can change and transform. A lot of the things that take place in a company are really based on traditional processes: What has worked before, and why would we change something that is working, right? If you're really going to go into an innovative frame; if you're really going to get into that next-generation innovation; if you are trying to figure out what's going to work in that ... it's never going to be the thing that is working. It's going to be the thing that you're going to learn as you go to that next step.

Culture hacking is really about looking at the people who are operating right now and trying to figure out how you're going to help them go from A to B, making that change. What is that experience going to be like?

What we have done, to Scott's point, is we've forced our security team to have empathy for the DevOps teams. We go through the process of developing something in the cloud, utilizing it as a method of taking their paranoia and trying to balance the notion of getting something done within a specific time frame. We try to really wrangle what it takes to do those things securely and safely but, ultimately, still be able to deliver for the business.

I think that culture hacking really comes into play when you're trying to figure out how to move somebody from the rock they're on to the rock you need them to be on and trying to figure out what those mechanisms are.

Derek: Part of your security practice is looking at open source and third-party components and your own binaries. Can you shed some light on how Intuit is using Sonatype solutions to better manage those vulnerabilities?

Shannon: Yeah, Sonatype Nexus is a fantastic platform. We love the Nexus repositories. We love how you guys put together a community. We learn a lot.

A lot of our DevOps practice is working together with it. We've put together our Nexus repositories to do code signing and figuring out how to really secure our pipelines a certain way. We are taking advantage of the fact that we can pick up components, track them and then scan them [for known vulnerabilities].

That's allowed us to reduce the defect count that goes to production. Actually scanning and looking for vulnerabilities within our components and our open source libraries allows us to make better decisions about what we're including in our software.

Derek: When you govern what open source, third-party or proprietary components are being used by developers, is there any kind of feedback from the teams saying, "Hey, you're restricting my behavior, not improving my innovation"?

Shannon: What we've found is that the notion of security approvals, exceptions and gates really doesn't work. Quite often, you just create a culture where developers are going to go out and do it, and then you're going to find out about it. When it comes to really partnering and being boundaryless about how you think about security in your business, it's all about transparency. It's all about benefits. It's creating things like a security markdown file within your repository manager. It's about taking responsibility and accountability for the things that you're doing from a security perspective in your development process. It's ultimately having an attacks.md file, keeping track of what's out there, keeping track of your open source, understanding what components you're leveraging, and why you made the decisions that you made to bring those things into your project.

At a top level, all of those things work. But having tools that help you understand the decisions made by the other open source programmers you're getting contributions from is really necessary. All of the things that they might be deciding are also part of your decision tree, and ultimately, you're rolling all of that and bundling it together. The attack surface is not just the decisions that your team is making, but the ones that you share across the code base that you've got.

Derek: Your practices are very mature. You've clearly developed them over a long time, and some people watching this might think, "Well, Intuit's a huge organization," and it may be daunting to them if they haven't started down the path of Rugged DevOps. Can you be a small team and have success in these kinds of practices?

Shannon: We're not exactly a huge organization, but we are relatively large in size now. When we got started, I believe I was one of maybe three people that started this, only a couple years ago. We have hired into our group pretty extensively to help grow it, and some of the things that we've done have really allowed us to operate differently, to bring in people and have them immediately be successful. Our practices allow someone "day one" to be able to work with the environment, to be able to develop code, to be able to contribute code that week.

We do things like weekly demos, where we actually do video demos. A person has to come in, program something, secure something, operate it and create a demo, all within their first week. So having the right bar for those folks is really important, but more importantly, our Red Team leader here (points to Ian), he came in and just is amazing, has created a Red Team pretty much out of thin air. So is having somebody from forensics, who's just done an incredible job to help us, to make it so that we have a lifecycle where we can snapshot something and be able to learn from it when it's actually offline.

Those are the types of practices where you start to extend yourself past the normal baseline practices of processes today and really think past that about how you're going to support innovation. You get into it very quickly. You get a learning culture. You get people who know that making mistakes - and figuring out how to learn from them - is okay. That's a really important part of that actual culture that you're putting in place.

Ian: Yeah, I was going to say, it's all about iteration, right? We started small, and we just continually iterate on what we're doing to try to get better and be better at what we all do.

When I first started this journey, I was a security guy - a pen tester. It was always the developer's fault. Developers always made the mistakes. I always had to clean up after them. But after six months of developing Ruby APIs and Ruby and working my butt off in code, the empathy was there.

I really understand what the developers are going through and why they make the choices they do. But I think by allowing us to help them, by creating tooling that allows them to self-serve, to understand it without making them ... helps them make themselves more secure without them having to become a security professional. I think that's kind of our ultimate goal.

Shannon: Being friendly hackers, right? Basically going out and attacking them so their applications don't get attacked by external attackers is really part of that frame.

Scott: The Red Team shift at the company has been profound because you see how people react. When the Red Team started, it was not as well shared, and a lot of people suddenly were very upset that they were attacked by the Red Team. But when it was pointed out, "Well, what would you rather have happen? Would you rather have somebody in China do this to you, that didn't work with you, didn't sit next to you and help you fix the product, or would you like to have a friend who, by the way, their job is to attack?"

When we went through several drills and actually practiced the muscle of defending the company against an attack, people were upset. "Oh, I had to do all this work."

My response to them: "Well, you did the right work."

"You did the right thing. You saw something bad. You did it. You did good. You practiced the muscle. Now when it happens again and it's not the Red Team, I know that you'll know what to do. You know that the process works, and we can actually defend the company faster and more securely."

Derek: Yeah. That's an incredible story. Thank you for sharing it.

My final question: If you could pick a superpower in dev, security or ops that you would have in the organization, what would it be?

Ian: To me, they're all alike; they're the same, right? That's what we do, DevSecOps, right? We try not to actually separate them out because I think once you start to separate them out, you start to lose perspective.

Scott: Yep.

Ian: There's a good thing about having them all be one thing, so I'd choose them all.

Scott: It's been pretty consistent. DevSecOps is the answer. What was the question? (Laughter)

Shannon: I think the reason we went out and created DevSecOps was just simply to change how we thought about doing development and technology - and to really get ahead of it, to realize that attackers weren't setting up appointments or meetings to help you figure out how they were going to attack your software, and so then why were we? Why were we operating at a fragile level?

I think that the superpower that I would like to have is DevSecOps because I know that we are going through the process of creating a less-fragile capability in security that will allow us to get ahead of attackers, make it much harder for them to go after the software that gets built, and we're seeing those improvements. That's actually a great thing.

Derek: It sounds really exciting, and it's very cool, so thank you all very much. I really appreciate it.

All: Thank you.

If you loved this interview and are looking for more great stuff on Rugged DevOps, I invite you to download this awesome research paper from Amy DeMartine at Forrester, "The 7 Habits of Rugged DevOps."

As Amy notes, "DevOps practices can only increase speed and quality up to a point without security and risk (S&R) pros' expertise. Old application security practices hinder speedy releases, and security vulnerabilities represent defects that can leave a company open to cyberattacks. But DevOps practitioners can leap forward with both increased speed and quality by including S&R pros in DevOps feedback loops and including security practices in the automated life cycle. These new practices are called Rugged DevOps."

More Stories By Derek Weeks

In 2015, Derek Weeks led the largest and most comprehensive analysis of software supply chain practices to date across 160,000 development organizations. He is a huge advocate of applying proven supply chain management principles to DevOps practices to improve efficiencies, reduce costs, and sustain long-lasting competitive advantages.

As a 20+ year veteran of the software industry, he has advised leading businesses on IT performance improvement practices covering continuous delivery, business process management, systems and network operations, service management, capacity planning and storage management. As the VP and DevOps Advocate for Sonatype, he is passionate about changing the way people think about software supply chains and improving public safety through improved software integrity. Follow him at @weekstweets, find him at www.linkedin.com/in/derekeweeks, and read his posts at http://blog.sonatype.com/author/weeks/.