O'Reilly data czar's advice for the enterprise - The Signal

O’Reilly’s Chief Data Scientist has advice for data in the enterprise

Parker Tarun

Last year, the “data scientist” title was all the rage. Everyone was hiring them, or at least talking about it. Never mind that companies couldn’t always define how such a role would fit in their organization.

This year, things have shifted. The hunger for data professionals hasn’t gone away, but companies have become more self-aware about their data diets. When I spoke recently with a head of data science at Fitbit, I was told that the department was trying to balance out some of its data scientists with more analysts and engineers. They were still figuring out the right ratio.

Little industry corrections like this betray a tension: Our ability to innovate with data has leapt forward. Our organizational capacity to support this innovation has lagged.

Ben Lorica, Chief Data Scientist at O’Reilly Media, clarified this for me. But he also offered optimism for the new year.

Ben’s in the business of communicating with high-functioning data geeks about high-functioning data topics like deep learning and smart cities. He also knows how to speak to the enterprise. As the challenges facing data become increasingly organizational, the competitive differentiation for companies in the near-term will be communicating value effectively, from data teams to business leaders and then back again.

Ben signaled three trends that could ensure this paradigm shift in the enterprise: enabling line of business people to make data-driven decisions, emphasizing interpretability in complex data initiatives, and effectively communicating what enterprises love—ROI.

2017’s metatrend

Ben’s something of a data whisperer, a go-between for so-called geeks (his terminology) and the executives of the world. In fact, his function isn’t so different from the O’Reilly brand itself: To be on the cutting edge, while still eminently recognizable to business leaders.

Every year on O’Reilly Radar, Ben compiles a list of data trends for the new year. Some of his predictions may seem matter-of-fact, but they come stamped with an additional note of confidence: Over the course of the year, Ben fleshes out his hypotheses, hitting the convention circuit and meeting the people who matter.

Compare Ben’s trends on O’Reilly Radar for 2016 with his updated list for 2017, and you see an interesting metatrend: Last year was largely about accelerations in the technical realm. This year leans heavily on how companies (and not specialists) will think about data and adopt its applications.

“The space of big data, and data science in particular, has undergone some maturation,” Ben said. “The technologies themselves are much more widely available and you have many more choices. You can cobble together your own solutions on-prem, or go onto the cloud and pick a managed service.”

As a result of this maturation, Ben’s trends contain as much insight for non-technical personnel as they do for his core audience. This is but another symptom of big data’s recent conquests. One day, we’ll all be analysts of some sort.

That doesn’t mean the maturation is a passive adaptation. Quite the opposite. Business leaders need to be actively thinking about how to incorporate data into their organization. And specialists need to return the favor.

“In many ways,” Ben said, “the focus is starting to be on how to actually use and adopt these technologies.”

Mass consumption…

When asked about Trend #5 (“The democratization of data: Simpler tools will simplify many tasks”), Ben elaborated in a way that was coy for its understatement: “Most people within companies aspire to make decisions backed by something. Usually data.”

In the heart of San Francisco, this feels obvious. Nevertheless, it bears saying. There’s always anxiety around decision-making in an enterprise setting, and data can mitigate this. Companies want to go from arguing, or what Steven Sinofsky once called “testosterone-based engineering,” to fact-based reasoning. They just might not know how yet.

Let’s face it. Teaching everyone to build ML models just can’t scale. Ipso facto, simpler tools will be key to migrating everyone to a fact-based approach.

“The line of business, front-line people, need to be able to make decisions using data,” Ben said. “As the tools that you need to build an end-to-end data pipeline become more accessible, many of the routine tasks might be possible for domain experts—the people right there making the decisions.”

This is an important distinction. While non-data geeks may seem fairly unsophisticated in the context of data-driven decision-making, we can’t forget why they have their job title in the first place. Domain knowledge makes them uniquely suited to solve their own problems. What they presently lack is the ability to get the necessary information, not the instincts to decide what should be done with it.

Ben feels this will change with better tools.

“The goal is to have everyone be able to consume this data,” he said. “Think about a marketing organization. In the past, you might have employed statisticians and machine learning people to do the modeling. That probably will still be the case, but maybe your marketing analyst can do much more now.”

In a perfect world, this scenario sees line-of-business folks pulling and understanding the numbers that matter to them with no friction. The wonkier “data geeks” build experiments that can 10x their usefulness, instead of doing more menial work.

“As more of these tools get pushed down to the organization, more and more people are going to make data-driven decisions.”

…and concentrated production

But it goes both ways. The first organizational shift is for line-of-business players to get easier access to data and to be able to autonomously address their own questions. The echo of that is for data organizations to target the business problems with the most leverage.

Ben used machine learning as an example of a concept that hasn’t been fully capitalized on yet. Even when the technology’s there, the confidence sometimes isn’t.

“The main challenge for machine learning right now is just interfacing with society,” Ben said. “Part of it is the ethics and fairness around machine learning, but then on a much more routine basis, there’s interpretability and explainability.”

Oftentimes enterprises have invested in hiring machine learning geniuses without exactly understanding their investments. It’s the responsibility of data scientists (and data teams) to evangelize their importance by explaining the value. A dialogue needs to open up between the data geeks and the head-scratchers.

“While the new techniques and technologies are cool, it’s important to emphasize their potential impact on the underlying business,” Ben said. “Hopefully, business managers will engage with data professionals who can help identify problems where big data and data science could make a great difference.”

None of this is to say the successful implementation of machine learning into large enterprises will be easy. It’s a push-pull of finding out where the resources can actually provide leverage and understanding where those business points of pain are.

Once enterprises are able to tackle those organizational problems, the headier stuff Ben’s invested in will likely accelerate—smart cities and artificial intelligence.

The enterprise’s road to more actionable insights can look daunting. But its north star should actually be a familiar one for business leaders:

“The key for data in the enterprise,” Ben said, “is to identify good business problems to tackle.”

For enterprises, which have the resources to invest in the best tools and talent for analytics, organizing the team is going to be as important as organizing the data. This will mean enabling line-of-business leaders to make data-driven decisions with simpler tools, emphasizing interpretability in the geekier departments, and creating a strong feedback loop between the data capabilities and the business needs.

Taken together, these three initiatives will entail a lot of work for enterprises upfront, but hey—it’s not as dense as an O’Reilly book.




