You can often tell whether a company is going to be successful by the way it treats its data. Does it view data as a key asset that should be nurtured, questioned, and broadly adopted? Or does it see data as a security risk that must be tightly guarded and only available to the few scientific minds who can understand it?
Companies in the former category often see a strong working relationship between data producers and data consumers. Data producers (data engineers, DBAs, ETL developers, data stewards, and others) are responsible for managing data infrastructure such as data warehouses, data lakes, and data pipelines.
Data consumers analyze data to create BI dashboards and reports, and run machine learning algorithms to answer business questions and make predictions. Data analysts, BI developers, and data scientists are examples of data consumers.
But there’s one data personality that’s left out in this mix.
Why you need a data product manager
Let’s look at a simple scenario. Say a data consumer requests data from a data producer. Now think about the following questions:
- Did the consumer communicate the correct message to the producer?
- Did the producer understand what the consumer wanted?
- Did the producer deliver the correct/precise results?
- Did the producer provide something repeatable, or was it one-off work?
You’ve probably heard the often-quoted 80/20 rule of data science: “Most data scientists spend only 20 percent of their time on actual data analysis and 80 percent of their time finding, cleaning, and reorganizing huge amounts of data.” This inefficient, repetitive process actually happens after the producer delivers data to the consumer. This is work data scientists don’t want to do, and you don’t want to pay them to do it.
Clearly there’s a gap between the data producers and data consumers that can impede business insights. This is where a data product manager comes in.
What does a data product manager do?
Let’s step back for a moment and make an analogy with software. A product manager serves as a bridge between the software engineers (i.e. producers) and the software users (i.e. consumers) to make sure that the software (i.e. data) satisfies the requirements of the users.
Now think about how that role can work for data-driven organizations in the form of a data product manager. This role should understand the ecosystem of people, data, and tasks in an organization, and which of those tasks the data needs to address. It is crucial to understand and document the requirements and knowledge of both data producers and data consumers in order to generate and manage clean, reliable, and meaningful data. The data product manager should be responsible for the data.
Data wrangling and cleaning tasks should fall under data product management. This is much more than eliminating white spaces, replacing wrong characters, and normalizing dates. This is about:
- debating with stakeholders, and then documenting, the definition of an “order” or “net sales”
- defining the schemas and models of what the data means
- grounding the meaning in the disparate data sources
- maintaining a data catalog
- applying an agile methodology to deliver reliable data without boiling the ocean
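To make the documentation and catalog items above a little more concrete, here is a minimal Python sketch of what a catalog entry for an agreed business term might look like. Everything in it is an assumption made for illustration: the `CatalogEntry` and `DataCatalog` names, the fields, the owner, the table names, and the sample definition of "net sales" are hypothetical, and a real organization would more likely use a dedicated catalog tool than hand-rolled code.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class CatalogEntry:
    """One business term whose meaning stakeholders have agreed on."""
    term: str
    definition: str
    owner: str  # who to ask when the definition is in doubt
    sources: list[str] = field(default_factory=list)  # where the term is grounded


class DataCatalog:
    """A tiny in-memory catalog keyed by case-insensitive term name."""

    def __init__(self) -> None:
        self._entries: dict[str, CatalogEntry] = {}

    def register(self, entry: CatalogEntry) -> None:
        key = entry.term.lower()
        # Refuse silent redefinitions: two meanings for one term is the
        # exact ambiguity the catalog is meant to remove.
        if key in self._entries:
            raise ValueError(f"'{entry.term}' is already defined")
        self._entries[key] = entry

    def lookup(self, term: str) -> CatalogEntry:
        return self._entries[term.lower()]


catalog = DataCatalog()
catalog.register(CatalogEntry(
    term="Net sales",
    # Illustrative definition only; each organization must settle its own.
    definition="Gross sales minus returns, allowances, and discounts.",
    owner="finance-data-team",                        # hypothetical owner
    sources=["warehouse.orders", "warehouse.refunds"],  # hypothetical tables
))

print(catalog.lookup("net sales").definition)
```

The point of the sketch is the shape of the record, not the code itself: a term, one agreed definition, an accountable owner, and the sources the meaning is grounded in.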
This is what I call knowledge science. And like any good scientist, this role should constantly experiment and measure. One way to measure success is how well the assets under management are adopted. How much are they used? How are they driving ROI?
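As a rough illustration of measuring adoption, the sketch below counts how often each data asset is queried and by how many distinct people. The log format (one `(dataset, user)` pair per query), the `adoption_summary` function, and the sample names are all assumptions for the example; real usage data would come from whatever query or access logs your platform provides.

```python
from collections import defaultdict


def adoption_summary(access_log):
    """Summarize usage per dataset.

    access_log: iterable of (dataset, user) pairs, one pair per query.
    Returns {dataset: {"queries": int, "distinct_users": int}}.
    """
    queries = defaultdict(int)
    users = defaultdict(set)
    for dataset, user in access_log:
        queries[dataset] += 1
        users[dataset].add(user)
    return {
        dataset: {"queries": queries[dataset], "distinct_users": len(users[dataset])}
        for dataset in queries
    }


# Hypothetical access log for two managed data assets.
log = [
    ("net_sales", "ana"),
    ("net_sales", "ben"),
    ("net_sales", "ana"),
    ("orders", "ben"),
]

summary = adoption_summary(log)
for dataset, stats in summary.items():
    print(dataset, stats)
```

Query counts and distinct users only show that an asset is being used; tying that usage to ROI still takes judgment about which decisions the data actually informed.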
Ask yourself: who in your organization is responsible for creating and managing reliable data that the rest of the business can use effectively? Who is the data product manager in your organization?