2017 was the year cryptocurrency went mainstream. Amid the growing pains, as digital currency empires rose and fell, Coinbase has stood for stability and order. As Director of Data Science and Risk at Coinbase, Soups Ranjan is responsible for making sure Coinbase remains the most trusted cryptocurrency trading platform.
Mixpanel’s Christine Deakers sat down with Soups to discuss how Coinbase positions itself as the white knight of crypto, how it uses machine learning to detect scammers, and the state of blockchain analytics.
For those who couldn’t make the event or want a quick recap, we’ve shared the four key points we took away from our conversation with Soups.
You can also watch the full discussion here.
In the Wild West of Crypto, being the good guys isn’t just good; it’s good business
Coinbase’s reputation is what separates it from the competition in the crypto trading space. That reputation is built on security, reliability, and transparency. On the regulatory front, Coinbase has adopted a white knight strategy.
While others tout crypto’s anonymity as a crucial feature, Soups points out that because blockchain operates on a distributed public ledger, it is in fact possible to see how funds are moving, a transparency advantage not shared by fiat currency.
Using machine learning to fight cybercrime
Coinbase stores tens of billions of dollars’ worth of cryptocurrency, so its first priority is keeping that money secure. As Director of Data Science & Risk, one of Soups’ top priorities is ensuring Coinbase users are protected from as many risks as possible. His team uses machine learning on several fronts: preventing payment fraud, detecting illegal use of cryptocurrency for compliance purposes, and preventing account takeovers.
Coinbase has built an anomaly detection system to prevent account takeovers. In the same way your credit card company might notify you of a purchase when you’re traveling abroad, Coinbase is constantly looking for anomalous behavior to flag. Soups also shared some interesting techniques that scammers use to perpetrate payment fraud. In one particular case, Coinbase’s machine learning rooted out an attempted fraud by noticing a slight discrepancy in the offenders’ screen sizes.
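To make the idea of flagging anomalous behavior concrete, here is a minimal sketch in Python. It is not Coinbase’s system, which the talk describes as machine learning based. It is a simple rule-based stand-in that flags a login when its device fingerprint has never been seen for that account. The class name `LoginMonitor`, the `check` method, and the choice of country plus screen size as the fingerprint are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, Set, Tuple

# Hypothetical fingerprint: (country, screen size). Illustrative only.
Fingerprint = Tuple[str, str]


class LoginMonitor:
    """Toy rule-based check: flag logins from unseen device fingerprints."""

    def __init__(self) -> None:
        # Fingerprints already trusted for each account.
        self._trusted: Dict[str, Set[Fingerprint]] = defaultdict(set)

    def check(self, account_id: str, country: str, screen_size: str) -> bool:
        """Return True if the login looks anomalous for this account."""
        fingerprint = (country, screen_size)
        trusted = self._trusted[account_id]
        # The first login for an account has no baseline, so it is not flagged.
        is_anomalous = bool(trusted) and fingerprint not in trusted
        if not is_anomalous:
            # Only non-flagged fingerprints become part of the baseline.
            trusted.add(fingerprint)
        return is_anomalous


monitor = LoginMonitor()
monitor.check("alice", "US", "1440x900")            # establishes the baseline
flagged = monitor.check("alice", "US", "1366x768")  # unseen screen size
```

In this sketch a flagged fingerprint is never added to the baseline, so a repeated suspicious login keeps being flagged. A real system would presumably add a verification step before trusting a new device.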
How he hires a data science team
“Data science” has become a term that means a million things to a million different people. So when it comes time for Soups to hire “data scientists,” he has to cut through the noise to figure out what a person’s actual skill set is. To do so, he gives interviewees 100 points to distribute across key data science categories such as SQL, data insights, distributed systems, backend software development, and machine learning, producing a histogram of their skills. He then asks for a second histogram of the skills they want to develop, which shows what their trajectory will look like.
Per Soups: “If someone does SQL plus data insights, they’re very likely data analysts. If someone does SQL, data insights, stats, then they’re a data scientist. If someone does some stats and heavy machine learning, then they’re a machine learning engineer. If someone does machine learning and backend software development, then they’re a machine learning platform engineer. If someone does MapReduce distributed systems and some backend development, then they are a data engineer.”
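The mapping Soups describes can be sketched as a small rule table in Python. This is only an illustration of the quote above, not a tool Coinbase is known to use. The function name `infer_role`, the category keys, and the 15-point threshold for counting a skill as significant are assumptions, as is the order in which the rules are checked.

```python
from typing import Dict, List, Set, Tuple

# Rules paraphrased from the quote, checked from first to last.
# The category keys are hypothetical labels for the skills Soups lists.
ROLE_RULES: List[Tuple[Set[str], str]] = [
    ({"distributed_systems", "backend"}, "data engineer"),
    ({"machine_learning", "backend"}, "machine learning platform engineer"),
    ({"machine_learning", "stats"}, "machine learning engineer"),
    ({"sql", "data_insights", "stats"}, "data scientist"),
    ({"sql", "data_insights"}, "data analyst"),
]


def infer_role(points: Dict[str, int], threshold: int = 15) -> str:
    """Map a 100-point skill histogram to the role it most resembles."""
    if sum(points.values()) != 100:
        raise ValueError("skill points must sum to 100")
    # A skill counts as significant once it reaches the assumed threshold.
    significant = {skill for skill, value in points.items() if value >= threshold}
    for required, role in ROLE_RULES:
        if required <= significant:
            return role
    return "unclassified"


candidate = {"sql": 40, "data_insights": 40, "stats": 20}
role = infer_role(candidate)  # "data scientist" under these assumptions
```

Running the same function on the second histogram, the skills a candidate wants to develop, would give a rough picture of the role they are heading toward.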
Data science is data cleaning
Soups offers one key piece of advice to any data scientist: “Don’t be afraid of messy data.” The real world isn’t going to offer clean datasets, so learning to turn a messy dataset into something clear that can offer insights is a crucial part of any data scientist’s toolkit. Avoiding “garbage in, garbage out” can be slow, painstaking work. But for all the exciting possibilities machine learning, data science, and artificial intelligence present, none of them is worth anything without a good dataset.
For Soups and his team at Coinbase, data science means using data to build out the most trustworthy platform possible. In the emerging blockchain space, stability stands out, and building a reliable platform for end users requires good internal processes.
To watch the full conversation with Soups Ranjan at Office Hours, see the discussion here.