Picture this: it’s the fall of 2019 at the Forrester Data Strategy Conference in Austin. An almost unthinkably distant past filled with live sports, cheaper Zoom stock, and an absence of phrases like “in these uncertain times.” 

On the last day of the conference, Dr. Jennifer Belissent is giving a talk about the rise of external data sourcing. A man in the crowd stands up and confesses that his team recently spent over a million dollars on a data contract, only to find out that another department had already bought the same data. Oops. 

This sort of thing isn’t an outlier; it’s a widespread problem. And it’s a symptom of a larger issue that companies need to take very seriously. Simply put, organizations almost never apply the same management rigor to third-party data as they do to their internal data.

Mismanaging external data is costly

The rationale for bringing in outside data is solid. You can reduce your blind spots by looking outside your own four walls. Teams bring in outside data for the same reason they hire consultancies: to challenge or affirm their strategies (and if they’re lucky, to uncover new opportunities). 

But acquiring the right data is only part of the solution. Knowing what you have, showing people how to use it, and making it easier to work with are just as vital. Unfortunately, teams too often treat the management of external data as an afterthought. Here are three reasons why this is a serious mistake: 

  1. Cost: Third-party data often only gets used by the department that brought it in, even though other teams may find it useful as well. When those teams don’t know a dataset exists, or aren’t sure whether they’re allowed to use it, the result is duplicate purchases and limited ROI.
  2. Context: Data isn’t useful unless it’s presented with info about where it came from, what it contains, and how it can be used. Teams often disregard the importance of context, which leads to challenges with trust and understanding. 
  3. Governance: Privacy considerations, licensing requirements, and contract compliance are daunting, but you can’t ignore them. One group might be able to use a dataset for product research, but not marketing… and another might be able to use it for sales strategy, but only at an aggregated level. Leaving external data unmanaged means that it’s only a matter of time before a governance crisis arises. 
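
To make the governance point concrete, here is a minimal sketch of how a catalog entry might record who can use a purchased dataset and for what. Everything in it is an assumption for illustration: the class names, the fields, the vendor, and the example dataset are hypothetical, and real license terms are usually more nuanced than a simple lookup.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class UsageGrant:
    """One permitted use of a dataset by one team (hypothetical schema)."""
    team: str
    purpose: str
    aggregated_only: bool = False  # True if only aggregated views are licensed


@dataclass
class ExternalDataset:
    """Catalog entry for a purchased dataset; all fields are illustrative."""
    name: str
    vendor: str
    owner_team: str
    grants: list[UsageGrant] = field(default_factory=list)

    def is_allowed(self, team: str, purpose: str, row_level: bool) -> bool:
        """Return True if a grant covers this team, purpose, and granularity."""
        for grant in self.grants:
            if grant.team == team and grant.purpose == purpose:
                # Row-level access is refused when the grant is aggregate-only.
                return not (row_level and grant.aggregated_only)
        # No matching grant: default to "not allowed".
        return False


# Hypothetical dataset mirroring the scenario described above.
foot_traffic = ExternalDataset(
    name="retail-foot-traffic",
    vendor="ExampleVendor",
    owner_team="product",
    grants=[
        UsageGrant("product", "product research"),
        UsageGrant("sales", "sales strategy", aggregated_only=True),
    ],
)
```

With a record like this, a question such as `foot_traffic.is_allowed("sales", "sales strategy", row_level=True)` gets answered before anyone touches the data, and the same entry tells other departments that the dataset is already in-house.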

How you should manage third-party data 

First, talk with your procurement teams to see if their asset management systems track data usage the same way they track software installations. This might be something they’re on top of already, but chances are they’ll be interested to find an overlooked cost center that they can help with. You’d be surprised how many otherwise sophisticated companies rely on message boards, email lists, or local knowledge to suss out whether or not specific external datasets are already in-house. 

Beyond that, find the forward-thinking people who care about this problem and drum up support. Look for the department heads who would love to get access to useful data for new projects and research; a good sign is if certain teams already use open or paid data to augment their work. And talk with the progressive people on your governance team who think that true stewardship is just as much about increasing data availability as it is about instituting policies to restrict improper use. 

The final step in managing your external data properly is to implement a catalog that appeals to both technical and non-technical users, lets you get hands-on with the data, and captures context in an intuitive way. 

If you’re already using data.world for your internal data, it’s just as easy to bring in and catalog third-party data. We’ll forgive you for using another catalog, but make sure to ask your provider how they handle external data. If you don’t like the answer, give us a call and we can help. As the only cloud-native data catalog, we can get your external data under management in weeks, not years.

Looking for a win this quarter? Drop us a line. We live and breathe this stuff.