Client Consistency at Slack: Beyond Libslack



Two years ago, I wrote a post about Libslack, Slack’s shared C++ client library. That post described how Slack used the Libslack library in its mobile applications to encapsulate shared business logic, and to handle syncing and caching of data.

In the intervening time, we decided to move away from using a shared C++ library in our clients, but we haven’t discussed that decision publicly. We were spurred to write an update when Dropbox published a post explaining why they, too, decided to stop using a C++ library in their mobile apps.

In this post, we’ll discuss why we decided to discontinue the Libslack project, and what we are doing instead. While we are no longer building a shared library, Slack still needs to maintain consistency and reduce duplication of effort while developing separate implementations of client infrastructure. We will talk about some of the techniques we used to achieve that.

Why we stopped development of Libslack

Many of the drawbacks Dropbox experienced with their shared library rang true for Slack as well. As described in our previous post about Libslack, there were certainly benefits to sharing code between client applications — a shared library increases consistency of behavior and prevents duplication of similar business logic in every Slack client. However, there are also downsides to this approach.

When the Libslack project began, the plan was for it to be used by the desktop, iOS, Android, and Windows Phone clients. Due to conflicting caching strategies and difficulty integrating into the build system, the desktop client never adopted Libslack, and the Windows Phone client was discontinued entirely. That left only iOS and Android sharing Libslack, which reduced the benefit. Many companies that use shared libraries successfully architected their mobile clients with one in mind. Slack added Libslack when its mobile apps were already mature, so the library was replacing existing functionality and had to fit into two different established architectures.

To further complicate matters, there was overhead to integrate the library into the build system for each client, to ensure debugging worked properly across language boundaries, and to sync up Libslack with releases for each of the clients. (While we were developing Libslack, our mobile clients shipped on different schedules; they now share release cycles and ship on the same dates.) Coordinating quality engineering for the library was difficult, as was determining when and what to hotfix. Most mobile engineers at Slack were not familiar enough with C++ and the processes for building and debugging Libslack to help fix issues in the library. Also, as mentioned in Dropbox’s post, hiring engineers with C++ experience, particularly on mobile, is hard, which would have made it difficult to grow and sustain the project.

In the end, Slack decided the overhead of developing the library outweighed the benefits, and we sunsetted the project, moving back to implementing Libslack’s functionality separately in each client application.

Life after Libslack

When the Libslack project started, iOS and Android engineers were organized into two large teams, each working on the entire app. By the time we stopped Libslack development, the mobile client teams had been split into pillars, where mobile engineers worked with frontend and backend engineers on specific feature areas. In addition to the feature teams, we added an infrastructure team for each mobile client, which was responsible for supporting data syncing and caching, among other things, and could take over the work done by Libslack. Most of the engineers from Libslack joined these infrastructure teams.

DataProviders

To handle the responsibility of syncing data for the application, both of the mobile infrastructure teams developed DataProvider frameworks, which ensure that all the data needed by the app is present and up-to-date. While they were developed separately on iOS and Android (written in Swift and Kotlin, respectively), these frameworks provide similar benefits and functionality. The DataProvider frameworks vend information about users, channels, messages, or other Slack objects to the rest of the app. When a query for data is made to the framework, it returns what is currently in the local cache, fetches updates if necessary, and notifies observers of changes as they come through. Objects from DataProviders are returned as immutable models. On iOS, where the app uses a CoreData cache, this means the rest of the app no longer needs to access mutable CoreData objects directly, which reduces the need to worry about concurrency issues and avoids the crashes due to accessing data on the wrong thread that are common with CoreData.
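The query flow described above (return what is cached, refresh when needed, notify observers, hand out immutable models) can be illustrated with a small sketch. This is not Slack's implementation, which is written in Swift and Kotlin. The names `User`, `UserDataProvider`, `fetch_remote`, `observe`, and `refresh` are all hypothetical, and the refresh is synchronous here only to keep the example short.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class User:
    """Immutable model handed to the rest of the app (hypothetical fields)."""
    user_id: str
    name: str


class UserDataProvider:
    """Hypothetical provider: serve from cache, refresh, notify observers."""

    def __init__(self, fetch_remote: Callable[[str], User]) -> None:
        self._fetch_remote = fetch_remote  # stand-in for a network request
        self._cache: Dict[str, User] = {}
        self._observers: Dict[str, List[Callable[[User], None]]] = {}

    def observe(self, user_id: str, callback: Callable[[User], None]) -> None:
        """Register a callback that fires when the cached user changes."""
        self._observers.setdefault(user_id, []).append(callback)

    def get(self, user_id: str) -> Optional[User]:
        """Return whatever is cached right now, which may be None."""
        return self._cache.get(user_id)

    def refresh(self, user_id: str) -> None:
        """Fetch the latest copy; store it and notify only if it differs."""
        latest = self._fetch_remote(user_id)
        if self._cache.get(user_id) != latest:
            self._cache[user_id] = latest
            for callback in self._observers.get(user_id, []):
                callback(latest)
```

Because `User` is a frozen dataclass, callers cannot mutate a model they were given. That is the same property that lets the rest of the iOS app avoid touching mutable CoreData objects.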

The DataProvider frameworks are also responsible for assembling model objects from multiple data sources, and for ensuring that they are consistent with the current user’s preferences. For instance, to supply user information to the app, we need basic properties like name and status, but also user presence and “do not disturb” settings, which have to be fetched from separate sources. The user’s display name is determined according to team and user preferences. To display a message, the client may need to resolve information about users or channels mentioned in that message, as well as any custom emoji it contains. This becomes even more complicated once we bring in messages from channels shared between different organizations. The DataProvider frameworks assemble all this information by sending queries to the Slack API and to Slack’s edge cache.
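As a rough illustration of assembling one model from several sources, the sketch below merges a profile, a presence value, and a "do not disturb" flag into a single immutable object. All the names are hypothetical, and the naming rule (a `prefer_real_names` flag with a fallback when no display name is set) is an assumption made for the example, not Slack's actual preference logic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    """Basic properties, assumed to come from one source."""
    user_id: str
    real_name: str
    display_name: str  # may be empty if the user never set one


@dataclass(frozen=True)
class UserModel:
    """Assembled, immutable view handed to the rest of the app."""
    user_id: str
    shown_name: str
    is_active: bool
    dnd_enabled: bool


def resolve_name(profile: Profile, prefer_real_names: bool) -> str:
    """Assumed rule: honor the preference, fall back when no display name."""
    if prefer_real_names or not profile.display_name:
        return profile.real_name
    return profile.display_name


def assemble_user(profile: Profile, presence: str, dnd_enabled: bool,
                  prefer_real_names: bool) -> UserModel:
    """Combine data fetched from separate sources into one model."""
    return UserModel(
        user_id=profile.user_id,
        shown_name=resolve_name(profile, prefer_real_names),
        is_active=(presence == "active"),
        dnd_enabled=dnd_enabled,
    )
```

Keeping the name resolution in one function means the rest of the app never has to re-derive it from raw preferences.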

The DataProvider frameworks keep data updates efficient through mechanisms like data versioning, lazy loaders, and session caching. They also vend APIs that let the rest of the app update the local cache when new information comes in or when the user makes changes.
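The post names these mechanisms without detailing them, so the following is only a generic sketch of two of them under assumed semantics. `VersionedCache` skips writes whose version is not newer than what is stored, and `Lazy` defers an expensive load until first use. Both class names and their behavior are illustrative assumptions.

```python
from typing import Callable, Dict, Generic, Optional, Tuple, TypeVar

T = TypeVar("T")


class VersionedCache:
    """Keeps (version, value) per key and ignores updates that are not newer."""

    def __init__(self) -> None:
        self._entries: Dict[str, Tuple[int, object]] = {}

    def apply(self, key: str, version: int, value: object) -> bool:
        """Store the value if its version is newer; return True if stored."""
        current = self._entries.get(key)
        if current is not None and current[0] >= version:
            return False  # stale or duplicate update, skip the write
        self._entries[key] = (version, value)
        return True

    def get(self, key: str) -> Optional[object]:
        """Return the stored value, or None if the key is unknown."""
        entry = self._entries.get(key)
        return None if entry is None else entry[1]


class Lazy(Generic[T]):
    """Defers a costly load until first access, then reuses the result."""

    def __init__(self, loader: Callable[[], T]) -> None:
        self._loader = loader
        self._loaded = False
        self._value: Optional[T] = None

    def value(self) -> T:
        """Run the loader on first call only; later calls reuse the result."""
        if not self._loaded:
            self._value = self._loader()
            self._loaded = True
        return self._value
```

Version checks avoid rewriting objects that have not changed, and lazy loading avoids fetching data the user never looks at.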

By creating these frameworks on both iOS and Android, we encapsulated the logic around accessing the cache and syncing data, and moved it out of the main application, just as we planned to with Libslack. This also gives us greater freedom in the future to alter the way we store and sync data without having to make changes across the app.

Client Council

We have started other initiatives to reduce duplication of effort across Slack clients. One of them is our Client Foundations Council, a weekly meeting of engineers from infrastructure and foundational teams in each Slack client. In this meeting we discuss upcoming features and changes to Slack that will affect all the clients. We share knowledge about how each client works, and we try to develop common solutions to issues the clients are facing, to ensure consistency. We bring in feature teams and engineers that are seeking input on new proposals. The meeting has also given rise to new feature ideas which have improved the behavior of all Slack clients.

Conformance Suites

Additionally, the clients have been able to keep in sync by developing shared conformance suites for some features, particularly those where the rules for implementation are complicated and there are many possible edge cases. This allows us to ensure identical behavior without needing to enforce the exact same implementation. We make these suites by creating a JSON list of test cases, with expected inputs and outputs. Then each client can create a test harness to run the suite, and validate that we are getting the expected outputs on every platform. We have created conformance suites for features such as handling highlight words and link detection in messages. Other companies have used conformance suites to help third-party clients conform to expected behavior (e.g., the conformance suites that Twitter has published to define how to parse tweets). An example from our highlight word conformance suite:
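The following is not Slack's actual suite, only a minimal sketch of the approach. The JSON schema (`name`, `text`, `words`, `expected`), the three cases, and the matching rule (whole-word, case-insensitive) are all invented for illustration. Each client would supply its own `find_highlights` and run the same shared JSON through its own harness.

```python
import json
import re
from typing import List

# Hypothetical suite; field names and cases are invented for illustration.
SUITE_JSON = """
[
  {"name": "simple match", "text": "deploy is done",
   "words": ["deploy"], "expected": ["deploy"]},
  {"name": "case insensitive", "text": "Deploy now",
   "words": ["deploy"], "expected": ["Deploy"]},
  {"name": "no partial words", "text": "redeployed",
   "words": ["deploy"], "expected": []}
]
"""


def find_highlights(text: str, words: List[str]) -> List[str]:
    """One client's implementation: whole-word, case-insensitive matching."""
    if not words:
        return []
    pattern = r"\b(?:" + "|".join(re.escape(w) for w in words) + r")\b"
    return re.findall(pattern, text, flags=re.IGNORECASE)


def run_suite(suite_json: str) -> List[str]:
    """Return the names of failing cases; an empty list means conformance."""
    failures = []
    for case in json.loads(suite_json):
        actual = find_highlights(case["text"], case["words"])
        if actual != case["expected"]:
            failures.append(case["name"])
    return failures
```

Because the cases live in data rather than in any one client's test code, adding an edge case to the JSON exercises every platform's implementation at once.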

Finally, one of the chief ways we ensure consistency between our clients is through Slack itself! We have feature and team channels where the engineers working on a project can discuss proposals and hash out implementation details, and other engineers can follow along if interested. There are escalation channels, where engineers can see what issues have arisen and ensure they are fixed everywhere. Engineers can ask questions in public channels to get more eyes on them.

Conclusion

While Slack decided against using a common C++ library to share code across its clients, we are still striving to increase consistency of behavior across clients and reduce duplication of engineering effort. As the size of teams using Slack grows exponentially, so does the number of messages, channels, and custom emoji, and we have to be smarter about what data we sync and how often we update it. The infrastructure teams work together to ensure that Slack clients are kept up to date with the data that is most important to the user.


