How to hire for data science and machine learning
Last Updated: April 7, 2020
"Data scientist" is one of the most misunderstood job titles, with its meaning varying wildly between organizations. One reason for this is that the term lumps together responsibilities which have more recently evolved into specialized roles: data analysts, data engineers, and machine learning engineers.
Some companies still need generalists who can deeply understand the business domain, set up cloud infrastructure, maintain data warehouses, process the data for analysis, and interpret model outputs.
Yet it is exceedingly rare to find (and retain) such generalists. The vast majority of companies would be better served hiring for more specialized roles, and finding candidates with slight overlaps (e.g. an expert at interpreting algorithms who can do basic infrastructure setup and maintenance if need be).
This guide primarily focuses on the competencies needed for the machine learning engineer role, the area where companies seem to struggle the most. Yet it also serves as a starting point for any related role, and highlights areas in the interview which can and should be customized for the different responsibilities.

Guide Creator: Reboot AI
www.reboot.ai
Reboot AI equips businesses for independence from AI consultants and outsourcing. We work with you to define, execute, and monetize an actionable data strategy and coach your teams through the delivery of real-world projects. Make data your competitive advantage by contacting us.
Criteria this guide covers
Interview GPS organizes questions by personal values, competencies, and skills.
Communication | Effectively communicates ideas to their manager, team, and other internal/external stakeholders |
Critical Thinking | Analyzes information objectively and makes a reasoned judgment |
Curiosity | Pushes to get to root causes, asks critical questions, and ‘takes things apart’ to see how they work |
Mental Agility | Thinks flexibly and takes in other perspectives and views |
Practical Thinking | Makes practical, common-sense decisions. Understands what is happening in a common-sense way |
Data Science | Using scientific methods, processes, algorithms and systems to extract knowledge and insights from structured and unstructured data. |
Machine Learning | Using algorithms and statistical models to perform tasks without using explicit instructions, relying on patterns and inference instead. |
Secondary criteria: | Learning, Collaboration, Conflict Resolution, Empathy |
Imagine I am your colleague in another department (e.g., sales). I have no knowledge at all of data science terminology or concepts. How would you explain your latest project to me?
Communication and Storytelling are vital to anyone working with data. The more senior the role, and the more collaboration expected of the role, the more true this becomes. At the end of the day, data only drives business value if the business acts on that data. That nearly always requires convincing stakeholders with a variety of backgrounds and levels of technical understanding. If the candidate is unable to bridge the gap between quantitatively minded and qualitatively minded audiences, all their analytics will be worthless.
You work for a credit card company (e.g., VISA or AMEX) which currently reviews every transaction manually to detect fraud. Your CEO wants to cut costs by automating fraud detection, but your CFO is worried about the cost of paying for each fraudulent transaction that goes undetected. How would you approach the problem?
The ability to abstract and break down complex systems, through Conceptual Thinking and Critical Thinking, is key to working with data in any capacity. Candidates must be able to think through complex chains of cause-and-effect relationships, design experiments to fill in gaps in knowledge, and draw probabilistic inferences and conclusions. This is true for any role, though it takes a slightly different shape for each. For data engineers, for example, the specific ability to think through architecture tradeoffs and their implications is vital.
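For interviewers who want a reference point when listening to answers, the CEO/CFO tension in the fraud question can be framed as a cost tradeoff. The sketch below is one possible framing, not the expected answer: it assumes a model that assigns each transaction a fraud score, sends transactions at or above a threshold to manual review, and auto-approves the rest. The names `Costs`, `total_cost`, and `best_threshold`, the scores, and all dollar figures are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Costs:
    manual_review: float  # hypothetical cost of a human reviewing one transaction
    missed_fraud: float   # hypothetical loss when fraud is auto-approved


def total_cost(scores, labels, threshold, costs):
    """Cost of reviewing every transaction scoring >= threshold.

    Simplifying assumption: a manual review always catches fraud,
    so the only fraud losses come from auto-approved transactions.
    """
    cost = 0.0
    for score, is_fraud in zip(scores, labels):
        if score >= threshold:
            cost += costs.manual_review   # CEO's concern: review spend
        elif is_fraud:
            cost += costs.missed_fraud    # CFO's concern: undetected fraud
    return cost


def best_threshold(scores, labels, costs, candidates):
    """Pick the candidate threshold with the lowest total cost."""
    return min(candidates, key=lambda t: total_cost(scores, labels, t, costs))


# Toy data: model fraud scores and the true outcomes (1 = fraud).
scores = [0.05, 0.10, 0.20, 0.40, 0.70, 0.95]
labels = [0, 0, 0, 1, 0, 1]
costs = Costs(manual_review=2.0, missed_fraud=100.0)
candidates = [0.0, 0.3, 0.5, 0.9, 1.01]

# A threshold of 0.0 reproduces today's "review everything" policy.
review_everything = total_cost(scores, labels, 0.0, costs)
chosen = best_threshold(scores, labels, costs, candidates)
```

On this toy data, reviewing everything costs 12.0, while the lowest-cost threshold (0.3) costs 6.0 by reviewing only the three highest-scoring transactions. Strong candidates tend to reach this kind of framing on their own and then question its assumptions, for example whether reviewers really catch all fraud or how the costs were estimated.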
Describe the last technical project you worked on outside of your job responsibilities (as a passion or hobby).
Curiosity is arguably the single most important characteristic to hire for when it comes to data. If you don’t hire for curiosity, your data teams are likely to spend time on surface-level tasks, such as simple data pull requests. They may never ask the questions necessary to get to root causes and expose the non-obvious insights that drive real value. Curiosity is key to teams that ask the right questions, such as “how is this data valuable to you?” And curious technical people are almost always working on some technical project in their own time, as a way to learn how something works (an industry, a tool, a technology, etc.).
Tell me about a time when you had to adjust to a colleague’s working style in order to complete a project or achieve your objectives.
Like Communication, Mental Agility and Empathy are also indispensable traits for data professionals. Anything can be viewed through a data lens, but not everyone naturally does so. Beyond just communicating with people who hold other points of view, data professionals must be able to see and work with different points of view themselves. They must not attach their egos to any one answer, and must instead be empirical, skeptical, and follow the data. The best candidates hold beliefs as hypotheses and care more about ‘finding what is true’ than ‘being right’. Adapting to other people’s views or ways of working is a powerful example of mental agility.
How would you design the backend for Airbnb?
Perfection is the enemy of progress, and nowhere is that more true than in data science. In fact, as I like to tell our clients, if your machine learning model is ‘perfect’, it’s a sure sign you’ve done something seriously wrong! Practical Thinking is a must for every role, because there are always tradeoffs to make between performance and precision. This is a great question because you can let candidates speak to the area they are familiar with: a data engineer will speak more to architecture, while a data scientist will speak more to model structure and organization. Plus, you can follow up by asking what they would build first. In each case, look for practical tradeoffs that prioritize getting something workable into production.
About the Author

Matt O’Connor is a data science and strategy coach, and the creator of the human+ai framework. He has led algorithmic investment for the world’s largest hedge fund, Bridgewater Associates, raised VC funding at 7-figure valuations, and coached some of the world’s largest companies on data and AI strategy and implementation.