For development teams, process can often be antithetical to speed. Ease of deployment and security tend to have an inverse relationship, with some resentment for the security team occasionally mixed in. You may have seen the following tweet:
We believe things don’t have to be like that. In this post, we will discuss how we’ve implemented our Security Development Lifecycle (SDL) at Slack, our lessons learned, and the tool we’ve built — and are open sourcing — that makes it all possible!
A Bit of Background
Slack is a rapidly growing company. We have over six million daily active users and over nine million weekly active users. We’re still relatively young, so this reflects a rapid acceleration of people using Slack. To deliver features to a growing number of users, we’ve also grown as an organization. At the start of 2015, Slack had 100 employees. Today, we’re over 800 people! These teams are spread over multiple offices and multiple countries, and our Engineering organization has been growing at a faster rate than our Product Security team.
We have a culture of continuous integration and continuous deployment. The process of deploying code to production is very simple, and takes about ten minutes total. This results in a life cycle in which we deploy code to production approximately 100 times per day. We’ve built our SDL process to provide coverage without being blocking, and to emphasize self-sufficiency rather than having our team manually review every pull request.
Ideally our process isn’t this:
But rather, this:
Deploying the SDL at Slack
“SDL” stands for “Security Development Lifecycle”. You may have seen the following image:
Microsoft uses this process to highlight security topics throughout numerous development stages and phases. But when talking about secure development, mentioning “life cycle” or “process” can cause people’s eyes to glaze over. The perception of “Adding Friction” doesn’t work well with the “Always Shipping” mantra.
To deploy the SDL at Slack, we had to account for a variety of constraints. We have a lot of work to do, and not a lot of time to do it (this isn’t a unique problem, of course). While we have excellent developers, inspiring secure design is sometimes difficult, and we wanted to avoid impeding a team’s output. A traditional waterfall approach to the SDL wouldn’t work in our process.
We wanted to maintain our culture of developer trust. Empathy is an important value at Slack, and we apply that in our security procedures. An adversarial relationship between the security and development teams is a dangerous possibility if the security team is not thoughtful. Our goal was to empower the development team to be self-sufficient; security is everyone’s responsibility.
As part of our culture of trust, we place a huge emphasis on transparency and availability. We keep our discussions in an open forum, we host weekly office hours for anyone who wants to stop by and chat about security, and we offer security training during employee onboarding, to both educate and present the faces of the security team. All of these serve to foster mutual trust and collaboration. We wanted a process that was understandable, and had a visible scope from the outside. We wanted something that moved power back to our developers, who are awesome and care a great deal about their work. Developers understand best what they’re building, and they know where the risk is. We use Slack to keep all our discussions in the open and give important stakeholders a voice.
The Tool: goSDL
With all of this in mind, we created goSDL, a tool that brings all these concepts together, enabling our developers to produce secure features at high output with low friction. The tool (which you can find here: https://github.com/slackhq/goSDL) is a web application that guides anyone involved with a new feature, like developers or PMs, through questions and checklists to improve the security posture of whatever they’re working on. The name is derived from the process of initiating a feature review — a developer uses a slash command in Slack, ‘/go sdl’, to begin the SDL process (the app is written in PHP, not Go 🤷).
The process starts out with some simple questions. Based upon the user’s responses, an initial risk ranking can be easily determined.
The Initial Risk Assessment
The Initial Risk Assessment allows a team to estimate their feature or product’s risk before needing involvement from the security team. Developers have insight into their own codebase; they know if the code or feature might impact a sensitive part of the code, such as authentication or authorization functionality. The risk ranking helps inform the level of involvement needed from product security.
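To make the idea concrete, here is a minimal sketch of how yes/no answers could be turned into an initial risk ranking. The questions, weights, and thresholds below are illustrative assumptions, not the ones goSDL ships with; the real questions live in the repository.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Question:
    key: str
    text: str
    weight: int  # hypothetical weight; higher means more security-sensitive


# Hypothetical questions, invented for this sketch.
QUESTIONS = [
    Question("auth", "Does this change touch authentication or authorization?", 3),
    Question("user_data", "Does it store or process new kinds of user data?", 2),
    Question("external", "Does it expose a new externally reachable endpoint?", 2),
    Question("third_party", "Does it add a third-party dependency?", 1),
]


def initial_risk(answers: dict[str, bool]) -> str:
    """Map yes/no answers to a coarse risk label using assumed thresholds."""
    score = sum(q.weight for q in QUESTIONS if answers.get(q.key, False))
    if score >= 5:
        return "high"
    if score >= 2:
        return "medium"
    return "low"


if __name__ == "__main__":
    print(initial_risk({"auth": True, "external": True}))
    print(initial_risk({"third_party": True}))
```

A weighted sum is only one possible scheme; the point is that the developer's own answers are enough to produce a first ranking before anyone on the security team is involved.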
The Component Survey
After the initial risk level is determined, there are further questions specific to the components of the feature being developed. The Component Survey allows the developers creating the software to scope the survey to their own needs. By default, it presents an array of “opt-in” content, so the questions stay concise and tailored to the team’s work. Components use a plugin-based architecture that is not specific to Slack’s infrastructure. This approach allows generalization of questions and content, and easy extensibility. If support for a new component, like a new language, is needed, it’s as simple as adding another JSON plugin. This encourages thinking about security at a higher level than just the code itself.
Each component consists of a primary module, and potential submodules. For example, a feature may include a “WebApp” component. This would be stored as a module, and issues from the OWASP Top 10 could be included as submodules. We’ve included example modules and submodules here.
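As a rough illustration of the module and submodule relationship, the sketch below loads a JSON description of a “WebApp” component and flattens the opted-in submodules into checklist items. The field names (`title`, `submodules`, `questions`) and the checklist wording are assumptions made for this example; consult the example plugins in the repository for the actual format.

```python
import json

# Hypothetical plugin layout; the field names are assumptions,
# not goSDL's actual schema.
WEBAPP_PLUGIN = """
{
  "title": "WebApp",
  "description": "Features served over HTTP to a browser",
  "submodules": [
    {
      "title": "Injection",
      "questions": ["All database queries use parameterized statements."]
    },
    {
      "title": "Cross-Site Scripting",
      "questions": [
        "User-supplied content is escaped before rendering.",
        "No new uses of raw HTML output were added."
      ]
    }
  ]
}
"""


def build_checklist(plugin_json: str, selected: set[str]) -> list[str]:
    """Return checklist items for the submodules the developer opted into."""
    module = json.loads(plugin_json)
    items = []
    for sub in module["submodules"]:
        if sub["title"] in selected:
            for question in sub["questions"]:
                items.append(f'{module["title"]} / {sub["title"]}: {question}')
    return items


if __name__ == "__main__":
    for item in build_checklist(WEBAPP_PLUGIN, {"Cross-Site Scripting"}):
        print(item)
```

Because the content is plain data, supporting a new component in a scheme like this means adding another JSON document rather than changing code.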
After the Component Survey is completed, the tool generates checklists for the person working through the SDL. Our approach to checklists is partially inspired by their use in preventing aviation accidents. The vast majority of air accidents and crashes are a result of human error. Many accidents could be avoided by properly adhering to checklists. Aviation checklists aim to enforce safety in the skies, and we employ them in a similar manner to enforce security in our code.
Crash investigations attempt to identify causes of accidents and prevent them in the future. We have added the completion of the SDL and associated checklists to our product team’s launch requirements list. We have simplified tasks as much as possible, presenting the items as statements that require little to no prior context to complete, and we acquire feedback via our bug bounty program and incident investigations that help us to improve the checklists as we progress.
When the Component Survey portion of goSDL is complete, two JIRA tickets are created for easy task tracking. One item contains the checklist for the team to complete as they build their feature, and one item is assigned to the product security team, allowing us to track our own review of the feature.
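The two-ticket pattern can be sketched as a pair of request bodies for Jira's create-issue endpoint (`POST /rest/api/2/issue`). This is a simplified illustration, not goSDL's implementation: the project keys `FEAT` and `PRODSEC`, the summaries, and the choice to place checklist items in the description (rather than in a checklist custom field) are all assumptions.

```python
def issue_payload(project_key: str, summary: str, description: str) -> dict:
    """Build a request body for Jira's create-issue endpoint."""
    return {
        "fields": {
            "project": {"key": project_key},
            "summary": summary,
            "description": description,
            "issuetype": {"name": "Task"},
        }
    }


def checklist_ticket(feature: str, items: list[str]) -> dict:
    """Ticket for the feature team; 'FEAT' is a hypothetical project key."""
    body = "\n".join(f"* {item}" for item in items)
    return issue_payload("FEAT", f"SDL checklist: {feature}", body)


def review_ticket(feature: str, risk: str, checklist_key: str) -> dict:
    """Ticket for product security; 'PRODSEC' is a hypothetical project key."""
    body = f"Risk rating: {risk}\nSDL checklist: {checklist_key}"
    return issue_payload("PRODSEC", f"Security review: {feature}", body)


if __name__ == "__main__":
    print(checklist_ticket("awesome", ["Queries are parameterized."]))
    print(review_ticket("awesome", "medium", "FEAT-123"))
```

The review ticket takes the checklist ticket's key as input, which implies creating the checklist ticket first so the second ticket can link back to it.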
We are using the Checklist for Jira plugin to enable the checklist custom fields in JIRA tickets, and the ScriptRunner for Jira plugin to create a custom REST API, which goSDL then uses to update the checklist field.
Within the JIRA ticket, there will be checklists that are populated based on the responses to the Component Survey questions. When the majority of items are marked as done, we reach out to the development team to address any outstanding unchecked items (and find out if an item was unclear or if the team is still working on it).
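The "majority done" trigger can be expressed in a few lines. This is a sketch under assumptions: the 50% threshold is one reading of "majority", and the dictionary of item names to checked state is a stand-in for whatever the checklist field actually returns.

```python
def needs_follow_up(items: dict[str, bool], threshold: float = 0.5) -> list[str]:
    """Return the unchecked items once more than `threshold` of the list is done.

    An empty result means either everything is checked or it is too early
    to follow up.
    """
    if not items:
        return []
    done = sum(items.values())
    if done / len(items) <= threshold:
        return []  # most items are still open; the team is likely mid-flight
    return [name for name, checked in items.items() if not checked]


if __name__ == "__main__":
    state = {"parameterized queries": True, "output escaping": True, "CSRF tokens": False}
    print(needs_follow_up(state))
```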
As mentioned earlier, two JIRA tickets are created by the tool. The second is assigned to the product security team. Upon its creation, we receive a Slack message informing us of the new review.
This JIRA review task gives us a heads up of incoming features and their risk ratings. This task item contains all the information we need to perform a security review:
- Technical/design specifications, which might illuminate other product areas worth exploring
- Contacts of the developer(s) involved in the project
- Slack channels related to the project
- Link to the SDL checklist, the first JIRA item
- Link to a relevant pull request
This provides the product security team with context as we work with the development team on the review.
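For illustration, the review context listed above could be gathered into one record and rendered as the Slack notification the security team receives. The `ReviewTask` type, its field names, and the message wording are hypothetical; the only external convention relied on is that Slack incoming webhooks accept a JSON body with a `text` field. The sketch builds the body and does not send it.

```python
import json
from dataclasses import dataclass


@dataclass
class ReviewTask:
    """Hypothetical container for the information a review ticket carries."""
    feature: str
    risk: str
    spec_url: str
    developers: list[str]
    channels: list[str]
    checklist_url: str
    pull_request_url: str


def slack_notification(task: ReviewTask) -> str:
    """Render the task as a JSON body suitable for a Slack incoming webhook."""
    lines = [
        f"New SDL review: {task.feature} (risk: {task.risk})",
        f"Spec: {task.spec_url}",
        f"Developers: {', '.join(task.developers)}",
        f"Channels: {', '.join(task.channels)}",
        f"Checklist: {task.checklist_url}",
        f"Pull request: {task.pull_request_url}",
    ]
    return json.dumps({"text": "\n".join(lines)})


if __name__ == "__main__":
    task = ReviewTask(
        feature="awesome",
        risk="medium",
        spec_url="https://example.com/spec",
        developers=["dev-one", "dev-two"],
        channels=["#feat-awesome"],
        checklist_url="https://example.com/checklist",
        pull_request_url="https://example.com/pr",
    )
    print(slack_notification(task))
```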
To maintain transparent communication, we use multiple Slack channels to discuss features and reviews with developers. Product security has a triage channel in which anyone can consult our team, and every new feature has its own channel (e.g. “#feat-awesome”). This documents progress and builds knowledge as we work through development processes.
These approaches are vital for successful, secure development. We enable development teams to be thoughtful about the security implications of their features, and we provide a process for that via the checklists. We guide developers to build securely through the completion of the SDL. We maintain channels for transparent, quick, and interactive communication; this encourages participation, while also providing a searchable system of record for all future work.
This system has been an invaluable means of gathering feedback about our security processes, as well as our successes and failures. We gain a high-level perspective on the software development trends taking place internally at Slack, as well as on areas of particular concern to security. Coupled with the external feedback from our bug bounty program (discussed shortly), this gives us two complementary signals that help us improve.
We have, on average, between five and ten developers reach out to us each day (often before an SDL or feature is even underway). This gives us a degree of assurance that we are maintaining transparency and availability. Sometimes, developers or teams will reach out with questions about one particular item on a checklist. If multiple individuals reach out, we learn that we should tweak that checklist (e.g. to be more clear or more relevant). We have also received suggestions to add or update more related content for components or checklists. We then create or update content, allowing other teams to quickly benefit from that feedback.
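One lightweight way to spot checklist items that need rewording is to count how many distinct people have asked about each one. The sketch below assumes questions are logged as (developer, checklist item) pairs and that two distinct askers is enough to warrant a second look; both the log format and the cutoff are assumptions.

```python
from collections import Counter


def items_to_revisit(questions: list[tuple[str, str]], min_askers: int = 2) -> list[str]:
    """Return checklist items asked about by at least `min_askers` distinct developers.

    `questions` holds (developer, checklist_item) pairs; repeat questions from
    the same developer about the same item are counted once.
    """
    askers = Counter(item for _, item in set(questions))
    return sorted(item for item, count in askers.items() if count >= min_askers)


if __name__ == "__main__":
    log = [
        ("ana", "CSRF tokens"),
        ("bo", "CSRF tokens"),
        ("ana", "CSRF tokens"),
        ("ana", "output escaping"),
    ]
    print(items_to_revisit(log))
```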
Components and checklists are easy to modify or expand. They are all JSON descriptions, which follow a simple and readable format. We have examples in the GitHub repository, and adding content is as simple as creating a JSON file.
Developers are smart, and they care about the product. They want to contribute to security, but don’t always have the same security expertise as a more specialized engineer. The checklist approach helps to promote security consideration during the development process, and helps passionate individuals improve their own security knowledge. Here are a few examples of feedback we’ve received from developers about the SDL:
“[The SDL] made me think of security things I normally don’t think about.”
“Having my team see me complete the SDL was useful and made me think through the things I was marking as complete.”
“Upon going over [the SDL], we discovered there were a few things we needed to look at… Wonderful.”
This feedback was unsolicited (😍), and tells us that the process is helping developers as they proceed.
Slack runs a bug bounty program, which allows us to get constant feedback on opportunities for improving our product’s security. The program surfaces some truly clever bugs, and provides an excellent source of evaluation of our security, especially when we find patterns in submissions. We’ve noticed that after we release a new feature, we get more activity on that feature from the bug bounty. This makes the bug bounty an excellent source of external feedback for our SDL implementation.
The number of valid bugs that we receive per month has remained fairly constant, while the number of features released (and SDLs completed) has been gradually increasing. Some months have fewer SDLs than average, but this varies based on when feature work begins and is completed.
When we plot the number of valid bugs found by our bug bounty versus the number of SDLs completed by our engineering team, we see that our ratio of bug bounty bugs to new features is improving. Feature development is accelerating, while our valid bug bounty bug submissions are decelerating. There are other factors at play here, but in general these are desirable trends which point to the success of our SDL process.
First, we owe a big thank you to everyone who has participated in Slack’s ongoing security mission: the developers who build and secure our features, the researchers who submit bugs to our bug bounty, and past and present team members who have helped us build this process! We’ve come a long way, and there is still further to go. By open-sourcing goSDL, we hope to enable other growing organizations to scale their security. We also hope to learn from their experience; we welcome contributions to the tool, its modules, and its checklists, and are excited to see what pull requests will come in!