How to interview your first data hire
I often see founders trying to pin down a precise role definition before hiring the first person onto their data team. This is reasonable because “data” can mean a lot of things: do you want someone to build machine learning models? A data engineer to set up ETL jobs? Someone to help run experiments and A/B tests? Someone who is good at visualization? The answer is probably “yes” to a lot of these, so as with many startup roles I would optimize for someone with 1) good product sense and 2) a good breadth and understanding of a lot of different disciplines. They should be a fast learner, eager to go deeper in a discipline as the company realizes it is useful or necessary, and able to help hire experts once the requirements are beyond their capabilities.
Note that this advice is geared towards startups that face a lot of uncertainty and could go in a lot of different directions. If you have a very specific idea of the tools that will be necessary, you will clearly want to emphasize these in the interview loop. I would be careful here, though, and check your certainty: I’ve talked to founders who were sure they needed an expert in recommendation systems even though the company hadn’t organized any data for that expert to work on. A startup like that would probably be best served by starting with a generalist.
I’ll break down four broad disciplines that I think you should interview for in your first generalist data hire. I’m personally not a fan of trivia-style interviews where you ask about knowledge of specific technologies (e.g. having a Kafka and streaming interview). For each discipline I try to give a sense of the minimum requirement.
Methods and Statistics
Statistical thinking is important for your first data hire even if you don’t expect to be designing and running A/B tests right from the beginning. No matter your domain, some basic statistical reasoning helps: e.g. how much data do you need from users before you have some certainty that you’ve made a good product decision? Statistics can be a tremendous asset early on, but to be honest, in most contexts I think it is likely the least important skill listed here. Its priority could evolve as the team grows.
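To make the “how much data do you need” question concrete, here is a minimal sketch of the kind of back-of-the-envelope calculation a candidate might walk through. It uses the common normal-approximation formula for comparing two conversion rates. The function name, the default significance level and power, and the example rates are all my own assumptions for illustration, not a prescribed method.

```python
import math
from statistics import NormalDist


def users_per_variant(baseline_rate, expected_rate, alpha=0.05, power=0.80):
    """Approximate users needed in each variant of a two-sided A/B test.

    Uses the normal approximation for a difference between two proportions.
    alpha is the false-positive rate; power is the chance of detecting
    the effect if it is real. Both defaults are conventional assumptions.
    """
    if baseline_rate == expected_rate:
        raise ValueError("the two rates must differ")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    # Sum of the binomial variances of the two groups.
    variance = baseline_rate * (1 - baseline_rate) + expected_rate * (1 - expected_rate)
    effect = expected_rate - baseline_rate
    return math.ceil((z_alpha + z_power) ** 2 * variance / effect ** 2)


# Hypothetical scenario: conversion is 10% today and we hope to reach 12%.
print(users_per_variant(0.10, 0.12))
```

The exact number matters less than the intuition it gives: halving the effect you want to detect roughly quadruples the users you need, which is often out of reach for an early-stage product.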
Coding
The data team should be able to code. Coding is fundamental to automation, and if there is no automation mindset on your data team, you are going to be in trouble.
But how well should they be able to code?
My bar is the ability to write procedural code to automate a task and the ability to improve that code a bit (e.g. pulling out functions and creating reasonable signatures for those functions). I’ve seen people bust out Google-style algorithmic questions for their early data hires, which I just don’t think is what most companies should prioritize.
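As a rough illustration of that bar, the sketch below automates a small reporting task and splits it into functions with clear inputs and outputs. The event data, column names, and function names are all hypothetical. The point is the shape of the code, not the task itself.

```python
import csv
import io
from collections import Counter

# Hypothetical export; in practice this would come from a file or a database.
RAW_EVENTS = """user_id,event,day
1,signup,2024-01-01
2,signup,2024-01-01
1,purchase,2024-01-02
3,signup,2024-01-02
"""


def read_events(source):
    """Parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(source)))


def count_events_by_day(events, event_name):
    """Count the rows whose event matches event_name, keyed by day."""
    return Counter(row["day"] for row in events if row["event"] == event_name)


def format_report(counts):
    """Render one 'day: count' line per day, in date order."""
    return "\n".join(f"{day}: {n}" for day, n in sorted(counts.items()))


signups = count_events_by_day(read_events(RAW_EVENTS), "signup")
print(format_report(signups))
```

A candidate who starts with one long script and then arrives at something like this on their own, with each function doing one thing, is showing about the level I would look for.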
Honestly, your first data hire should probably know SQL before they show up on day one… because, well, that’s just obviously useful. If they don’t know SQL, I think it’s reasonable to ask them to learn the fundamentals for the interview, and I would also expect them to be very strong in other programming languages.
Data Modeling
When I say data modeling, I don’t mean predictive modeling; I mean data structuring: the candidate can organize data into a set of tables that support answering the analytical questions important to your organization.
For your early data team, a lot of the job will inevitably be organizing data into useful forms that can be used for analysis and driving action. Since this is a lot of the job, why not just structure an interview around it? Have the candidate structure a database to answer a series of example analytical questions. Once they’ve modeled the domain, have them answer the analytical questions with code.
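Here is a minimal sketch of what such an exercise could produce, assuming a made-up subscription product. The tables, columns, sample rows, and the analytical question are all hypothetical. A real interview would use a domain close to your own business.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Step 1: the candidate models the domain as tables.
conn.executescript("""
CREATE TABLE users (
    user_id     INTEGER PRIMARY KEY,
    signup_date TEXT NOT NULL,
    plan        TEXT NOT NULL
);
CREATE TABLE orders (
    order_id   INTEGER PRIMARY KEY,
    user_id    INTEGER NOT NULL REFERENCES users(user_id),
    order_date TEXT NOT NULL,
    amount_usd REAL NOT NULL
);
INSERT INTO users VALUES
    (1, '2024-01-01', 'free'),
    (2, '2024-01-01', 'paid'),
    (3, '2024-01-05', 'paid');
INSERT INTO orders VALUES
    (10, 2, '2024-01-03', 40.0),
    (11, 2, '2024-01-20', 60.0),
    (12, 3, '2024-01-06', 25.0);
""")

# Step 2: the candidate answers an analytical question with code.
# Question: how much revenue does each plan generate per user?
revenue_per_user_by_plan = conn.execute("""
    SELECT u.plan,
           COUNT(DISTINCT u.user_id) AS users,
           COALESCE(SUM(o.amount_usd), 0) * 1.0
               / COUNT(DISTINCT u.user_id) AS revenue_per_user
    FROM users AS u
    LEFT JOIN orders AS o ON o.user_id = u.user_id
    GROUP BY u.plan
    ORDER BY u.plan
""").fetchall()

for plan, users, revenue_per_user in revenue_per_user_by_plan:
    print(plan, users, revenue_per_user)
```

Details worth probing in the conversation include why the join is a LEFT JOIN (so users with no orders still count) and how the schema would change if plans could change over time.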
Even if you’ve decided that your first data hire won’t do any data wrangling (and I’m skeptical of this decision), being able to articulate what types of data structures they need will be an incredibly helpful skill.
Product and Analytics
I’ve already covered the importance of hiring for product skills in a previous post. Your bar for product skills should be very high. To test for product skills, have the candidate think through some actual problems they might have to solve in the organization and have them talk through how they would prioritize them.
We haven’t really covered analytical skills, and I think product-oriented interviews are a great place to gauge a candidate’s analytical thinking as well. Case studies could include analyzing user behavior, building out data products, prioritizing analytical features in your product, or analyzing your operations. A great candidate will help you think through a real problem in your organization in a new way and give you an idea of how to approach it.