My first job out of school was a cool-sounding robotics role at a startup in Boston. When I started work as a robotic systems analyst, it turned out a lot of my job was just figuring out where to find the right data in various databases. I learned this thing called “SQL” and ended up doing a fair amount of data gophering (i.e., digging for data for other people). One of my first tasks was to pull all of a robotic system’s configurations from the database, line them up, and audit them by comparing each of our deployments. When I was done, I handed the report over to the support team to help finish the analysis. I ended up doing this same analysis a few times for new deployments and saved my code so it would be easy to reproduce.
Eventually, I realized that instead of fetching and rerunning this query each time, I could build a tool people could use to do the analysis for themselves. So we set up a web server, automated the query and analysis, added a search box, and let everyone see how different configurations were set across all our deployments. Every time we sold a new robotic system and put it online, configurations for the new deployment magically showed up in the tool. I had built my first data product! Now, people could get 99% of the value out of my manual reports and I didn’t have to lift a finger.
Not only did this tool save me time, but now the support organization could compare configurations across all our deployments at no cost, so they ended up doing it way more often. We built in features to flag configuration discrepancies, and in the process moved towards automating some of the insights, not just the raw information. People would check the tool throughout the day, dozens of times per week. Clearly this value would not have been possible without automation.
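To make the "flag configuration discrepancies" idea concrete, here is a minimal sketch in Python. It is not the code from the original tool: the deployment names, setting names, and the rule "flag anything that differs from the most common value" are all assumptions made for illustration, and a real tool would pull the configurations from the database instead of a hard-coded dictionary.

```python
from collections import Counter


def flag_discrepancies(configs):
    """Flag settings whose value differs from the most common value.

    `configs` maps a deployment name to a dict of {setting: value}.
    Values are assumed to be simple hashable scalars (numbers, strings).
    A setting that is missing from a deployment is treated as None.
    """
    # Collect every setting name seen in any deployment.
    all_settings = set()
    for settings in configs.values():
        all_settings.update(settings)

    flagged = {}
    for setting in sorted(all_settings):
        values = {name: cfg.get(setting) for name, cfg in configs.items()}
        # Assumption: the most common value is the "expected" one.
        expected, _ = Counter(values.values()).most_common(1)[0]
        outliers = {name: v for name, v in values.items() if v != expected}
        if outliers:
            flagged[setting] = {"expected": expected, "outliers": outliers}
    return flagged


# Hypothetical deployments and settings, for illustration only.
EXAMPLE_CONFIGS = {
    "site_a": {"max_speed": 1.5, "estop_timeout_s": 2},
    "site_b": {"max_speed": 1.5, "estop_timeout_s": 2},
    "site_c": {"max_speed": 2.0},
}

if __name__ == "__main__":
    for setting, report in flag_discrepancies(EXAMPLE_CONFIGS).items():
        print(setting, report)
```

With the example data, `site_c` is flagged twice: once for a `max_speed` that differs from the other sites and once for a missing `estop_timeout_s`. The "most common value wins" rule is only one possible choice; comparing against a reviewed baseline configuration would be a reasonable alternative.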
I see the push for data products as a push towards automation and building software: moving work that used to have high marginal cost (e.g., ad-hoc requests) towards work with low marginal cost (e.g., systems and tools that can be consumed).
Data as a Service => Data as a Product
Data products and marginal cost
The main value of treating data as a product is delivering tools that can be consumed over time instead of just delivering one-off insights (or worse, one-off raw data). If your culture is centered on Data as a Service (the service desk model), you are treating the data team like a low-leverage vocation whose primary function is to be the human interface to the database. Data people are expensive and there is a much higher-leverage way to think about this team!
Thinking like a product delivery organization helps you deliver repeatable, high-leverage (and low marginal cost) insights and decision support that allow your organization to be data-driven without the data team being involved in each decision or insight, or that at the very least automate some of the work the team does.
Here are a few things to consider before you get started:
- Evaluate your organization: is data as a product a natural fit?
- Product skills and hiring
- Establish a product roadmap and lifecycle
1. Evaluate your organization
To get to data as a product in your culture, you’re going to have to convince your organization that it’s worth investing in reproducible tools and processes that deliver value over time. There is some up-front investment in these tools, but they pay dividends (like all software).
If you work at a tech company that is engineering- or product-driven, this might be an easy sell because this is how digital products are broadly developed. Your engineering team is probably focused on tools, investing in infrastructure, CI, testing and observability, so the case for data as a product follows the same logic.
If you are not working in that type of organization, it may take some partnership to convince your team to give you the space to develop data products. Consider what type of organization you’re in and whether data as a product (DaaP) is a natural fit. You should approach selling DaaP differently based on that context.
2. Product skills
Product skills and entrepreneurship can be learned and taught. Organizational buy-in matters, but implementing Data as a Product also requires product skills across the data function. As an aside, I think you should hire for product skills from the beginning, and help mentor and teach these skills on your team.
But how do you actually learn product skills? One option is just hiring a product manager or data product manager. There’s also a lot of reading and research you can do. I think The Lean Startup is old-but-good because it emphasizes the value of building incrementally / agile development. There’s a lot of good advice in the now decades-old Good Product Manager / Bad Product Manager which emphasizes researching and learning your company’s market and customers deeply. This will help you empathize with all of the teams that are aligned on delivering direct customer value (in case that isn’t the data team). There’s a lot that I’m missing here and would love your references if you have any specific product resources for data teams.
Another way to improve product skills is by practicing. We’ll go into the product delivery lifecycle briefly below. Going from listening to users’ problems, to digging into their context, to figuring out solutions, to delivering products is great practice. You’re learning product skills by doing product work.
3. Product lifecycle and the hack to get started with DaaP
Ad-hoc data requests are the perfect place to start the process of building a data product. My hack is to treat these ad-hoc requests like prototypes for a potential data product. Each of these ad-hoc requests is a signal that your organization wants and needs data. There might be an opportunity to productize the value generated from the request, and to get deeper into the business value by getting more context on it.
These ad-hoc prototypes may be worth scaling into a data product. Put your product manager hat on: if you tried to deliver this internally as a product, would people use it? Would it drive value for the organization? You know this person wants this data. Will they want it again? Do you have confidence they’ll actually use the data in the first place? Do other people want this type of data? What insight are they trying to get out of it? Is it worth automating this? Can the organization get much more value if we turn this into a tool / product? Is the value of building and maintaining this product greater than the value of the additional ad-hoc requests you would otherwise have handled (the opportunity cost)?
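The opportunity-cost question can be roughed out with back-of-the-envelope arithmetic. The sketch below is one possible way to frame it, not a rule from this post: every number and parameter name is a hypothetical input you would estimate yourself, and it only counts hours saved on the data team's side.

```python
import math


def breakeven_requests(build_hours, maintenance_hours_per_month,
                       hours_per_adhoc_request, horizon_months):
    """Number of ad-hoc requests the product must replace to pay for itself.

    All inputs are rough estimates in hours; `horizon_months` is how far
    ahead you are willing to look when judging the investment.
    """
    if hours_per_adhoc_request <= 0:
        raise ValueError("hours_per_adhoc_request must be positive")
    total_product_hours = build_hours + maintenance_hours_per_month * horizon_months
    return math.ceil(total_product_hours / hours_per_adhoc_request)


def worth_productizing(expected_requests_per_month, build_hours,
                       maintenance_hours_per_month, hours_per_adhoc_request,
                       horizon_months):
    """True if the expected request volume covers the break-even count."""
    needed = breakeven_requests(build_hours, maintenance_hours_per_month,
                                hours_per_adhoc_request, horizon_months)
    return expected_requests_per_month * horizon_months >= needed


if __name__ == "__main__":
    # Hypothetical estimates: 40 hours to build, 2 hours/month to maintain,
    # 3 hours per ad-hoc request, judged over 12 months.
    print(breakeven_requests(40, 2, 3, 12))
    print(worth_productizing(2, 40, 2, 3, 12))
```

This deliberately leaves out the upside of a product being used more often than anyone would have asked for a report, so treat a "no" from this estimate as a prompt to look closer, not as a verdict.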
The answers may be “no”: not worth prioritizing. A reasonable outcome when you start treating your service work as a prototype is a balanced portfolio of short- and long-term data work. You can start by carving out a third of your time and roadmap for ad-hoc requests, a third for working proactively on data products, and a third for maintenance and support of existing products. This, of course, is just an example of how you can start to develop a portfolio approach to your data product roadmap.