Scaling without losing focus on meaningful metrics – how Twitch gets it right
Published date: Sep 29, 2015
We’re in the cafeteria of Twitch’s San Francisco office. Screens are on every wall, streaming games of League of Legends, Counter-Strike, and DotA 2. Dance music emanates from unseen speakers. A friendly golden retriever is wandering our way, looking for some attention. None of this fazes Drew Harry. He’s focused – excited even – about documentation.
“I know it’s not everyone’s favorite thing,” he says. “But it’s important.”
It is. Particularly at a company growing at the rate that Twitch is. Last year they more than doubled their monthly viewers – from 45 million to 100 million – and added a handful of new ways to watch, including PS4, Xbox One, and Chromecast apps. Educating everyone internally about how to find meaning in the midst of massive data and user growth has been a major challenge.
“We’ve brought on a dozen new product managers. If they show up and just see 150 events firing across 10 platforms, they’re not going to know what is going on.”
Twitch’s origins are in a website called Justin.tv, a streaming video platform that allowed anyone to create a live channel, broadcasting whatever their heart desired. Or at least that’s what it became. Justin.tv started as a guy named Justin – Justin Kan, now a partner at Y Combinator – strapping a cam to a hat and streaming his life.
In 2011 Emmett Shear, one of the co-founders of Justin.tv, launched a section devoted entirely to gaming broadcasts and called it Twitch.tv. By February of 2014 it had outpaced its parent, and the company rebranded as Twitch to focus entirely on streaming video gaming. In August Amazon acquired Twitch for $970m.
Drew is the Director of Science at Twitch, where he heads a small team of data scientists. For the data team, meaningful insights are built on high quality data sources that conform to clear definitions. And with a different product team for each of Twitch’s dozen platforms, the slightest ambiguity can result in confusion. Even for a user action as seemingly simple as “began watching video.”
With Twitch streaming on iPhones, Android, tablets, Xbox One, PS4, Roku, and a handful of others, you could see how things could get confusing, and how event documentation would be important for heading off as much chaos as possible.
Tracking when a user starts to watch a video is a crucial event in understanding how people are using Twitch. But what about on an Xbox when a video autoloads? Is that the same? If a video starts playing when you hover the mouse over it, is that the same as clicking through to a channel and watching?
“From day 1 we’ve focused on data to guide our decision-making process. Some of the time, this means classic data science and crunching the numbers on user behavior in aggregate across millions of actions. Sometimes it means interviewing 10 key example broadcasters to understand their views of the world. Sometimes it means market research on how competitor’s features are working.” – Emmett Shear, Twitch co-founder and CEO, to Data Science Weekly
How crowds behave online
Drew joined Twitch last year as a data scientist and became the team lead earlier this year. It’s his first data job, but what he’s working on at Twitch is a natural progression of his studies at MIT’s Media Lab, just more complex and at a much larger scale.
“Media Lab is a weird interdisciplinary place. It’s all about imagining the future through specific artifacts.”
Companies like Harmonix, the original developers of Guitar Hero, E Ink, makers of the electronic paper displays used by both the Kindle and Nook, and One Laptop per Child, have roots in the Media Lab.
“It’s one thing to sit there and think, ‘Wouldn’t it be great if this existed?’ It’s another to get it 10 or 15 percent of the way there. To actually show it could work.”
At Media Lab, Drew studied how a crowd of people behaved on the internet. He contributed papers like “4chan and /b/: An Analysis of Anonymity and Ephemerality in a Large Online Community” and worked on software for getting people to ask better questions at conferences – testing weird voting schemes to see how people would react.
“It’s a very satisfying process to make something, see how people use it, then build it better.”
There would be a couple hundred people at the conferences using his team’s software. On the day I showed up to Twitch there were somewhere around 500k people in one online room watching and chatting about a Counter-Strike tournament. Twitch is what a stadium of people looks like online.
Becoming part of your media life
The long term goal at Twitch, the one Drew and the Science team work every day to better understand, is as simple as it is lofty. They want Twitch to be a part of your media life. How we consume media is changing. We’re all leaving traditional media for new experiences that are better suited to our digital world. And Twitch has carved out a sizable chunk of that new market. A 2014 study by DeepField showed that Twitch was the 4th largest site on the internet in terms of peak traffic. It trailed only Netflix, Google, and Apple, and it was ahead of other streaming outlets like Hulu and Amazon.
To reach that long term goal, they have a similarly simple short term goal – to get people who care about games to come to Twitch, and for them to find something every time they come. Which sounds nice and all, but kinda boring, until you realize that they’re for real. There’s no activation process. Or, at least, it’s not what we all are used to when we think about activation.
As users find value in features like chatting and following, they’ll naturally decide to create an account, but Twitch isn’t ushering them into some funnel.
“There’s not some moment you ‘activate’ a user. There’s nothing like that. Our relationship with people is longer term and a bit smoother. We’re not trying to get you signed up through some signup process.”
Millions of their users aren’t logged in. They don’t have an account. And Twitch doesn’t distract people from their primary objective by forcing them into a traditional sign up process.
“We do qualitative research to understand how people are using Twitch, and I’ll be absolutely stunned by a regular viewer who has no mental model of how Twitch works. They just know when there’s a big Counter-Strike event, people will be tweeting about it, so they’ll show up to Twitch to watch.”
Large events are one primary discovery path for new Twitch viewers. It’s how many first arrive on the Twitch site, and it’s how many users casually find something to watch. As an event gains viewership, it rises towards the top of the Twitch homepage, which acts like a town square. It’s where users go to see what is happening on Twitch, and they’ll end up watching a game they hadn’t thought about in years.
The big communities stand out. People playing DotA 2, StarCraft, League of Legends. But they’re less like islands and more like sister cities. People traverse from one community to another. The user base is very continuous; there are no clean clusters. It’s not as simple as “User A does this and User B does that.”
In this map, each circle is a specific channel on Twitch. The lines between channels represent the amount of overlap between the audiences of those channels; each time a specific viewer watched two different channels during the time period this data draws from, it makes the connection between those channels a little stronger. – Visual Mapping of Twitch and Our Communities, ‘Cause Science!
The One metric behind a successful Twitch experience
All of this means the data team tries to keep target actions general, encompassing a successful user experience when a visitor might not even log in. The metric they’ve settled on is watching a stream for 5 minutes. It’s far less than the 106 minutes that an average Twitch user will watch in a day. A drop in the bucket compared to the 20+ hours half of Twitch’s users will watch in a week. But it is the benchmark for a good Twitch experience. It means you were able to discover something interesting that you watched for a nontrivial amount of time.
Sure, nothing magical happens between the 299th and 300th second. The point is just to pick a cutoff that works. It’s more important that your takeaways are directionally correct and sticky than that they are overly precise.
“Success for Twitch is defined as watching a piece of content for longer than 5 minutes. A whopping 91% of Twitch’s minutes watched are on video views longer than 5 minutes. Also, there is strong evidence that viewers decide what they like quickly; 40% of follows happen in the first minute of a view. By the definition above, 46% of sessions succeed.” – Surfing the 4th Largest Stream of Data
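The definition quoted above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Twitch's actual pipeline: the `VideoView` record, its field names, and the rule that a session counts as successful if any one of its views passes the cutoff are all hypothetical choices made for the example.

```python
from dataclasses import dataclass

# The article's cutoff: a view counts as a success if it lasts longer than 5 minutes.
SUCCESS_THRESHOLD_SECONDS = 5 * 60


@dataclass
class VideoView:
    session_id: str         # hypothetical identifier for a visit
    seconds_watched: float  # how long this one view lasted


def is_successful_view(view, threshold=SUCCESS_THRESHOLD_SECONDS):
    """A view succeeds only if it is strictly longer than the cutoff."""
    return view.seconds_watched > threshold


def session_success_rate(views, threshold=SUCCESS_THRESHOLD_SECONDS):
    """Fraction of sessions containing at least one successful view (assumed rule)."""
    sessions = {}
    for view in views:
        succeeded = is_successful_view(view, threshold)
        sessions[view.session_id] = sessions.get(view.session_id, False) or succeeded
    if not sessions:
        return 0.0
    return sum(sessions.values()) / len(sessions)


def successful_watch_time_share(views, threshold=SUCCESS_THRESHOLD_SECONDS):
    """Share of all watch time that falls inside successful views."""
    total = sum(v.seconds_watched for v in views)
    if total == 0:
        return 0.0
    good = sum(v.seconds_watched for v in views if is_successful_view(v, threshold))
    return good / total
```

Keeping the threshold as a parameter makes it cheap to check that conclusions stay directionally the same at, say, 4 or 6 minutes, which is the property the cutoff is supposed to have.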
High impact projects
This is what the data team does. A complicated chart or graph rarely makes an impression. They could send them every day, and it might look like they were accomplishing something, but if it’s not making an impact on how Twitch is thinking and operating, then what does it matter?
“We do a full-court marketing press when we learn something important. We’ll share our 8 page report, but then we’ll also send out a 3-sentence take away email, and we’ll condense it to a single slide for the company meeting.”
“We don’t want to be the oracle of the company, with people just coming to us and asking, ‘How many times does this happen?’ and then we give them a number and they walk away. There’s no value in that.”
Drew wants someone from the team to be in the room when the questions are being asked. Someone who can look behind to see the heart of the question and determine if data can answer it.
“Data doesn’t have inherent value. Just because you can count it doesn’t mean it means something.”
He wants the team working on high impact projects, driving a real understanding throughout the company of how people are using Twitch, to answer questions about how Twitch grows and makes money.
“Data science is a hard thing to do well, and it’s not obvious when it’s done poorly. It requires a high level of trust.”
Precisely defined events
As Twitch’s offering becomes more diverse, figuring out the top-level metrics that answer those questions and accurately tracking the underlying actions become more important.
“The event tracking model is really, really powerful, and it’s easy to add across our platforms. You just code it in. When this happens, send this event with these properties.”
But that doesn’t mean tracking anything and everything. Or, at the very least, it doesn’t mean that everything you track is on equal footing and expected across all platforms.
“The web team could fire a pageview event, but does that mean that we need a pageview event on the Xbox? Is there even a concept of ‘a pageview’ on the Xbox?”
For Drew, that means determining the user actions that actually matter, and being meticulous in defining how they are implemented.
“We have our tier one events that must fire on every platform. And it’s critical that we have super precise specifications.”
These tier one events are the backbone of Twitch’s analytics, company wide. Tier two is important, but not an emergency if a release breaks one. And then there’s the Wild West.
It’s most important for the top events to be correctly implemented, but the Wild West gives the teams on different platforms the chance to be agile – to move fast and track anything in their specific platform that might provide a meaningful answer to immediate questions.
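One way to picture the tiering described above is as a small, checkable spec. The sketch below is an assumption-laden illustration rather than Twitch's real tooling: the `Tier`, `EventSpec`, `missing_properties`, and `tier_one_gaps` names are invented for this example, and the article does not say how Twitch encodes its specifications.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    ONE = 1        # must fire on every platform
    TWO = 2        # important, but not an emergency if a release breaks one
    WILD_WEST = 3  # platform-specific, no cross-platform guarantee


@dataclass(frozen=True)
class EventSpec:
    name: str                       # hypothetical event name
    tier: Tier
    required_properties: frozenset  # property names every payload must carry


def missing_properties(spec, payload):
    """Return the required property names absent from one event payload."""
    return sorted(spec.required_properties - set(payload))


def tier_one_gaps(specs, events_by_platform):
    """Map each tier one event to the platforms that never fire it."""
    gaps = {}
    for spec in specs:
        if spec.tier is not Tier.ONE:
            continue  # lower tiers are allowed to differ between platforms
        missing = sorted(
            platform
            for platform, fired in events_by_platform.items()
            if spec.name not in fired
        )
        if missing:
            gaps[spec.name] = missing
    return gaps
```

A check like `tier_one_gaps` could run before a release, so a missing tier one event on any platform is caught early while Wild West events stay free to vary.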
“It’s easy to think things are changing so quickly, there’s no value in writing it down. Or to think that it’s obvious and doesn’t need to be documented. It’s easy to figure it out, finish the project, and move on. That feels like velocity. It feels like you’re being an agile team. But you have to care about docs.”
A recent project had a teammate digging into language properties. Which is another thing that seems simple, but gets complex. What does language mean? The broadcast language? The browser language? The user settings language? What if they disagree?
“She figured it out, now we just need to write it down. And you know what? It’s so liberating when someone asks you something and you can just say, ‘Check the docs. I know they’re pretty good on this topic.’”
That helps teams be autonomous: everyone has a strict definition of what something means and what they are working towards.
And that’s ultimately the impact of the science team. To find the most efficient way to use resources to make progress in the things the organization cares most about.
“If people don’t see our work and its relevance to them, then what’s the point? There’s no virtue to being clever in the corner.”