We hear it over and over: companies are increasingly looking outside their own four walls for data. This is prompting organizations to expand existing data sourcing teams or build them for the first time. The benefits are clear: third-party data can help you anticipate market changes and find new opportunities.

But throwing people and money at a problem to build competency isn’t sufficient. If you’re serious about bringing in the right outside data, then it’s crucial to develop best practices that reflect the unique nature of data acquired from vendors.

Data acquisition teams tend to be set up one of two ways: 

  1. Someone from the business side is paired with a data scientist or data engineer to evaluate outside data. The businessperson thoroughly understands the use case and interfaces directly with vendors, then passes the sample data to the data scientist or engineer for testing. The advantage of this approach is that leadership can assign these duties to existing employees rather than hiring new ones.
  2. The skills above are combined into a single role. We see this more often in senior roles (VP of Data Partnerships, Director of Data Acquisition, etc.) or on mature teams that have dedicated data hunters.


Here are three telltale signs that your data sourcing team is humming along: 

They have a relentless focus on prioritization

Most data hunters tell us their biggest limitation is personal bandwidth. There’s no shortage of ideas to test, but it’s hard to find time for it all (especially when they’re asked to pause evaluations to work on other internal projects). Recommendation: Let the data hunters focus exclusively on this problem; resist the temptation to reassign them to fill gaps on other short-term projects.

They retain all vendor context

Data is only as good as the context surrounding it, so it’s crucial to extend your data management mindset to vendors. Where are the limits of the data? How should this data NOT be used? Do you and the vendor use the same word to describe different phenomena in the data? These types of questions are frequently raised during testing, but the answers are all too often lost in email threads or call notes saved on a laptop. Recommendation: Develop a framework for capturing the knowledge that emerges during the sales cycle. Standardize the questions you ask vendors for a better apples-to-apples comparison when purchasing.
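One lightweight way to capture this context is a shared record that every evaluation fills in against the same question list. The sketch below is only an illustration: the `VendorEvaluation` class, the `STANDARD_QUESTIONS` list, and the vendor and dataset names are all hypothetical, and your own question set will differ.

```python
from dataclasses import dataclass, field

# Hypothetical standard question set; replace with your team's own list.
STANDARD_QUESTIONS = (
    "What are the known limits of this data?",
    "How should this data NOT be used?",
    "Which vendor terms differ from our internal definitions?",
)


@dataclass
class VendorEvaluation:
    """Vendor context captured during one dataset evaluation."""
    vendor: str
    dataset: str
    answers: dict = field(default_factory=dict)

    def record(self, question: str, answer: str) -> None:
        # Only accept questions from the shared list so that
        # evaluations stay comparable across vendors.
        if question not in STANDARD_QUESTIONS:
            raise ValueError(f"Not a standard question: {question!r}")
        self.answers[question] = answer

    def unanswered(self) -> list:
        # Questions still open for this vendor, in standard order.
        return [q for q in STANDARD_QUESTIONS if q not in self.answers]


# Example with made-up vendor and dataset names.
evaluation = VendorEvaluation(vendor="Example Vendor", dataset="foot-traffic")
evaluation.record(
    "How should this data NOT be used?",
    "Not suitable for store-level forecasts.",
)
open_questions = evaluation.unanswered()
```

Because every vendor is asked the same questions, comparing two evaluations becomes a matter of reading the answers side by side, and `unanswered()` shows what still needs to be raised on the next call.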

They never forget why they purchased a dataset

Don’t keep your data subscriptions out of habit or precedent. Keep them because they’re actively improving the business or answering a critical question. It sounds simple, but at large companies, it can be easy to lose track of the original rationale for buying a dataset. Recommendation: Every time you begin a purchasing discussion, document the business problem the dataset addresses and explain why you think this particular dataset solves it. For example, maybe you bought it for ten fields out of the 200 available. Why did you purchase those specific ten?
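A purchase rationale can be as simple as a small structured note stored alongside the contract. The following is a minimal sketch under that assumption; the `PurchaseRationale` class, its field names, and the example values are hypothetical and not part of any particular tool.

```python
from dataclasses import dataclass


@dataclass
class PurchaseRationale:
    """Why a dataset was bought, recorded at purchase time."""
    dataset: str
    business_problem: str
    fields_needed: dict   # field name -> reason it was needed
    total_fields: int     # number of fields the vendor offers

    def field_usage(self) -> float:
        # Share of the available fields that motivated the purchase.
        return len(self.fields_needed) / self.total_fields


# Made-up example: ten fields needed out of 200 available.
rationale = PurchaseRationale(
    dataset="example-dataset",
    business_problem="Anticipate regional demand shifts.",
    fields_needed={f"field_{i}": "feeds the demand model" for i in range(10)},
    total_fields=200,
)
usage = rationale.field_usage()
```

At renewal time, the note answers the question directly: which fields were needed, why, and whether the business problem they served still exists.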

Start incorporating these practices into your acquisition process, and you’ll save yourself a lot of unnecessary headaches down the road. There are other things to do along the way, like making sure you have a catalog for the data, but these steps will help you build the right cultural foundation.

To learn more about best practices for sourcing and managing external data, or if you want help finding certain datasets for your data catalog, send me a note: james.gray@data.world. I’d love to help.