A product engineering tale
Back in the heady days of early July 2017, I was tasked with adapting the Medium 💚 system into something that allowed variable input. With the binary 💚 button, users could tell us when they liked a story, but we wanted to know how much they liked it, in comparison with other stories.
In theory, this new data would allow us to surface truly good content over shiny clickbait. It was also the first piece of a larger project to open up our partner program and start paying Medium writers!
The Medium 💚 system worked like this.
On web, iOS, and Android, we had front-end components that rendered the button everywhere: on post listings, user profile pages, responses, topic pages, and in a few places on every post page. When you tapped it, it sent a 💚 request to our backend.
The request handler for 💚 updated a field on the UserPostRelation table to indicate that this user had recommended this post, then emitted an event that fanned out to do many more things, like send push/activity feed/email notifications, update post stats, update author stats, send data to our recommendations pipeline, send data to our social graph database…
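To make that shape concrete, here is a minimal sketch of a write-then-fan-out handler. It is not our actual code: the in-memory `relations` dict, the `subscribe` decorator, and the two subscribers are all hypothetical stand-ins for the table and the downstream consumers listed above.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical in-memory stand-in for the UserPostRelation table.
relations: Dict[Tuple[str, str], dict] = {}

# Hypothetical list of downstream consumers of the recommend event.
subscribers: List[Callable[[dict], None]] = []


def subscribe(handler: Callable[[dict], None]) -> Callable[[dict], None]:
    """Register a function to be called for every emitted event."""
    subscribers.append(handler)
    return handler


def handle_recommend(user_id: str, post_id: str) -> dict:
    """Record the recommend first, then fan the event out to every subscriber."""
    row = relations.setdefault((user_id, post_id), {"recommended": False})
    row["recommended"] = True
    event = {"type": "recommend", "user_id": user_id, "post_id": post_id}
    for handler in subscribers:
        handler(event)
    return event


post_stats: Dict[str, int] = {}
notifications: List[str] = []


@subscribe
def update_post_stats(event: dict) -> None:
    # Assumed consumer: bump a per-post counter.
    post_stats[event["post_id"]] = post_stats.get(event["post_id"], 0) + 1


@subscribe
def send_notification(event: dict) -> None:
    # Assumed consumer: queue a notification message.
    notifications.append(f"{event['user_id']} recommended {event['post_id']}")
```

The point of the sketch is the ordering: the relation row is the source of truth, and everything else hangs off the event, which is why patching in upstream of the event let us leave the consumers mostly alone.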
We decided to keep most of this infrastructure in place and patch our new multi-recommendation system into it as far upstream as possible, to minimize backend changes as we experimented on the front-end. Conveniently, we already had a field on UserPostRelation that we were using to store how many times a user clapped for a Series.
In my project planning, I allocated two weeks (out of five) to design and prototyping experimentation. So while I built an ugly but functional button and connected it to our backend, Jess, Herzog, Peter, and Wolfe came up with all this cool stuff!
A week in, Tess did some user research. People thought the designs that looked like rating systems required too much cognitive effort, so we moved away from those. Everyone loved Herzog’s cute animations though, so they stayed.
Product questions 🤔
After our two weeks of fun, it came time to make some hard calls.
- What would the new icon be?
- How many times could a user recommend a single post?
- What number would we show next to the button — how many times it was recommended, or how many different people recommended it?
- Would the new action be only for our members?
Internal opinion was…varied. Some people wanted recommends to be capped as low as 3 per person per post. The prototype I released internally had a cap of 100. Some people wanted no cap at all, like claps in Series. For the number to show next to the button, we got an impressive quantum full house of preference: recommends, people, both, neither.
💚 felt like a great icon when the recommend system was binary, but nobody wanted to give 20 💚s, let alone 50. We considered 💡 (too erudite), 💧 (too divorced from sentiment), and 🎉 (too frivolous). Ev, who has the world’s driest sense of humor, half-jokingly (I think) suggested we should make a tree that grew bigger the more you recommended, and eventually produced knowledge-apples.
I, naturally bad at narrowing scope under the best of circumstances, had a stressful week. I coped by gathering advice from everyone in sight. Kathryn, our head of marketing, told me something really insightful: that internal opinion, though it seemed all over the map, was actually divided into two camps:
Camp 1 imagined that the new action would be lightweight and easy to perform, that each recommend would be a drop in an ocean, and that people would want to keep expressing their appreciation until it felt right.
Camp 2 thought the new action should be more weighty and meaningful, that each individual recommend should carry significant value, and that people would want to think carefully about exactly how many recommends they gave.
People’s preferences tended to align with their camp. I was staunchly Camp 1, so I wanted a high cap, I wanted an icon that felt like something you could give 5 or 15 or 40 of, and I wanted to show the total number of recommends (even if it was giant).
Katie understood immediately, and further intuited that both camps were valid but the real danger lay in the lukewarm compromise in the middle. Within 12 hours, she put out an internal Medium post explaining the phenomenon and declaring that we were going with Camp 1: 👏 icon, capped at 50, showing total clap count, going out to everyone.
And we were re-aligned.
The data backfill 🚚
Since claps are stored on a different field in the UserPostRelation table than recommends, we needed to backfill them from recommend data. Otherwise, every post written pre-👏 would suddenly show zero engagement.
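In spirit, the per-row rule looks something like the sketch below. The field names `recommended` and `clap_count` and the one-clap-per-recommend conversion are assumptions for illustration, not the real schema or the real conversion rate.

```python
from typing import Dict, Iterable


def backfill_claps(rows: Iterable[Dict], claps_per_recommend: int = 1) -> int:
    """Give recommended rows with no claps a clap count; return how many rows changed.

    `claps_per_recommend` is an assumed conversion rate, not a documented one.
    """
    changed = 0
    for row in rows:
        # Skip rows that already have claps so a re-run does not overwrite real data.
        if row.get("recommended") and not row.get("clap_count"):
            row["clap_count"] = claps_per_recommend
            changed += 1
    return changed
```

Skipping rows that already have a clap count keeps the job safe to re-run, which matters when a backfill has to be restarted partway through.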
I dedicated three cavalier sentences to this integral part of the project in my tech spec and handed it off to my Code2040 intern, Dmitri, in his second week.
I forgot about backfilling claps on author stats (which power our stats pages) and post stats (which are used everywhere). Each of these additional backfills was actually two backfills, one for summed totals and one for time-series data. The original UserPostRelation backfill also turned out to be more complicated than I thought, spanning 30 million objects and requiring sharding.
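For readers who haven't sharded a backfill before, one common approach is to split the keys into independent buckets with a stable hash, so each bucket can be processed (and retried) on its own. This is a generic sketch of that idea, not a description of how our job was actually partitioned; `shard_for` and `partition` are hypothetical helpers.

```python
import zlib
from typing import Dict, Iterable, List, Tuple


def shard_for(user_id: str, post_id: str, num_shards: int) -> int:
    """Map a (user, post) key to a shard using a hash that is stable across runs."""
    key = f"{user_id}:{post_id}".encode("utf-8")
    # zlib.crc32 is deterministic, unlike Python's salted built-in hash() for str.
    return zlib.crc32(key) % num_shards


def partition(
    keys: Iterable[Tuple[str, str]], num_shards: int
) -> Dict[int, List[Tuple[str, str]]]:
    """Group keys by shard so each shard can be backfilled independently."""
    shards: Dict[int, List[Tuple[str, str]]] = {i: [] for i in range(num_shards)}
    for user_id, post_id in keys:
        shards[shard_for(user_id, post_id, num_shards)].append((user_id, post_id))
    return shards
```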
Within a week, Dmitri had eclipsed me in knowledge of our events and stats pipelines. Within three weeks he’d finished writing code for all five backfills. He spent the remaining time on the project capacity-planning with the dev ops team and shepherding backfills through dev, staging, and production, all the while casually picking off tasks from all over the rest of the stack.
30 million 👏 for Dmitri, the most extraordinary intern in the universe.
Interesting bugs 🐛
This project involved keeping a lot of different kinds of data in sync, so, unsurprisingly, we spent a lot of time fielding reports of data bugs.
On our internal version of Medium, people would clap for a post, come back later, and their claps would be gone. Sometimes they saw errors in their browser. Sometimes their claps would appear in post totals, but their name wouldn’t show up in the list of applauders.
Once the backfills were complete, we started chasing down reports in earnest. We eventually untangled and fixed three separate bugs:
1. Rate limiting
We used to allow people to recommend 100 posts per day. In the vast majority of cases, people were hitting 💚 once per post, so we rate-limited per request. Once someone hit their limit for the day, we started sending back HTTP 429s.
When we launched 👏 internally and before we implemented a reasonable batching strategy, people were suddenly hitting their daily limit after interacting with just two posts. Worse, we were updating the number of people who clapped (on post stats) and the number of claps a user gave (on UserPostRelation) in different places in the code. The former was hitting the rate limit; the latter was not.
Kyle and Dmitri fixed this by changing the rate limit to only count a first clap, and reordering code so that even when the rate limit was hit, data wouldn’t get out of sync. I hurriedly implemented better batching.
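The fix can be sketched roughly like this. It is a simplified model under assumptions, not the code Kyle and Dmitri wrote: `ClapStore` and `RateLimited` are hypothetical names, the daily reset is left out, and the limit check simply happens before either counter is written so the two cannot drift apart.

```python
from typing import Dict, Set, Tuple

DAILY_POST_LIMIT = 100   # posts per day, as described above
MAX_CLAPS_PER_POST = 50  # per-user, per-post cap


class RateLimited(Exception):
    """Stand-in for responding with HTTP 429."""


class ClapStore:
    """Hypothetical in-memory model of the two counters that had to stay in sync."""

    def __init__(self, daily_limit: int = DAILY_POST_LIMIT) -> None:
        self.daily_limit = daily_limit
        self.posts_today: Dict[str, Set[str]] = {}      # user -> posts clapped today
        self.user_claps: Dict[Tuple[str, str], int] = {}  # (user, post) -> claps
        self.post_clappers: Dict[str, int] = {}         # post -> number of people

    def clap(self, user_id: str, post_id: str, count: int = 1) -> int:
        """Apply a batch of claps; only the first clap on a post consumes quota."""
        key = (user_id, post_id)
        is_first = key not in self.user_claps
        seen = self.posts_today.setdefault(user_id, set())
        # Check the limit before any write, so a rejection leaves both counters untouched.
        if is_first and len(seen) >= self.daily_limit:
            raise RateLimited(f"{user_id} reached {self.daily_limit} posts today")
        if is_first:
            seen.add(post_id)
            self.post_clappers[post_id] = self.post_clappers.get(post_id, 0) + 1
        total = min(self.user_claps.get(key, 0) + count, MAX_CLAPS_PER_POST)
        self.user_claps[key] = total
        return total
```

The `count` argument is where client-side batching comes in: a burst of taps becomes one request carrying a number, rather than one request per tap.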
2. Old mobile clients
We knew we were going to have a graduated rollout for 👏, so there would be some time when 💚 and 👏 would have to co-exist. We made sure 👏 was backward-compatible, but we separated 💚 and 👏 flows by user.
People who got 👏 early but still had old mobile clients with 💚 could briefly increment post totals from their apps without affecting their own UserPostRelation clap count.
3. Deactivated users
This was maybe the weirdest bug we found. A whole bunch of posts showed up with 1 clap from -1 people. When we dug in, we found that all of their post stats had recent changes at the exact same timestamp.
Our data platforms team found the culprit: we recently deactivated the account of a prolific spammer, and somewhere in our data pipeline a non-idempotent event to decrement post stats got sent twice, leading to the -1 people counts.
We also weren’t decrementing claps on user deactivation, resulting in the single orphaned claps.
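One common guard against this class of bug is to make the handler idempotent by remembering which event IDs it has already applied, and to remove the user's claps in the same step as the people count. The sketch below illustrates that idea only; `PostStats`, `apply_deactivation`, and the notion of an `event_id` are assumptions, not a description of our pipeline.

```python
from typing import Dict, Set


class PostStats:
    """Hypothetical per-post counters with duplicate-event protection."""

    def __init__(self) -> None:
        self.clappers: Dict[str, int] = {}  # post -> number of people who clapped
        self.claps: Dict[str, int] = {}     # post -> total claps
        self._seen_events: Set[str] = set()

    def apply_deactivation(self, event_id: str, post_id: str, user_claps: int) -> bool:
        """Remove one deactivated user's contribution; repeated event IDs are ignored."""
        if event_id in self._seen_events:
            return False
        self._seen_events.add(event_id)
        # Decrement people and claps together, and never let either go below zero.
        self.clappers[post_id] = max(self.clappers.get(post_id, 0) - 1, 0)
        self.claps[post_id] = max(self.claps.get(post_id, 0) - user_claps, 0)
        return True
```

With this shape, a double-delivered event leaves a post at 0 claps from 0 people instead of 1 clap from -1 people.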
We rolled 👏 out to our iOS and web beta groups first, then to 10% of our users. Initial reactions were mixed, with the main complaint being that 👏 seemed pointless. But interestingly, quite a lot of people correctly guessed that claps would become an important signal when Medium started to pay writers. Good work, product prophets!