What License Should I Use for my Data?

by | Nov 7, 2017 | Data community

If your data does not have any license terms, you retain all rights, and you do not authorize anyone to use, copy, distribute, share, combine, change, or create derivative works from it.

The more open a license is, the higher the chance that others will use your data and recognize you for your work as a proponent of open data. Check out data.world member @jaredfern, an NLP researcher at Northwestern, who published several of his datasets under the Public Domain. By giving his data an open data license, he gives others the opportunity to make advances in his research and even create derivative works.

Some things to consider when choosing a license…

How do you want others to use your dataset?

Your data could be important to solving a pressing issue. We encourage you to maximize your data’s potential by choosing an open license. The more open a license you choose, the more others can use, share and distribute your dataset to get to insights faster.

Your data will likely not be used alone

When datasets with different licenses are combined, the licenses may conflict and greatly restrict or even prohibit the resulting work. By choosing the most open license, you amplify your dataset’s usefulness. For example, two datasets that both carry CC-BY licenses can be combined under those license terms, unless they use different versions of the license. Licenses may also seem similar (like CC-BY and ODC-ODbL) yet contain conflicts that prevent the datasets from being combined.
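To make that rule of thumb concrete, here is a minimal Python sketch. It encodes only what the paragraph above says: two datasets are treated as safely combinable when they share the same license and the same version, and anything else is flagged for a closer look. The License type and the can_combine function are hypothetical names made up for illustration, and a False result means "read the license terms first", not a legal verdict.

```python
from typing import NamedTuple


class License(NamedTuple):
    """Hypothetical record for a dataset license (illustration only)."""
    family: str   # e.g. "CC-BY" or "ODC-ODbL"
    version: str  # e.g. "4.0"


def can_combine(a: License, b: License) -> bool:
    """Return True only for the case the text describes as safe:
    same license family and same version. Everything else needs review."""
    return a.family == b.family and a.version == b.version


cc_by_4 = License("CC-BY", "4.0")
cc_by_3 = License("CC-BY", "3.0")
odbl_1 = License("ODC-ODbL", "1.0")

print(can_combine(cc_by_4, cc_by_4))  # same license, same version
print(can_combine(cc_by_4, cc_by_3))  # same license, different versions
print(can_combine(cc_by_4, odbl_1))   # different licenses
```

The check is deliberately conservative. Some license pairs that it flags may turn out to be compatible once you read the terms, but it never approves a combination the paragraph above would not.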

Choose an established and current license

An established license is widely adopted and was drafted by an organization dedicated to making its licenses functional in many situations, interoperable, clear and understandable. Make sure you’re using the latest version of the license, since new versions are developed periodically.

What we at data.world recommend

At data.world, we really love current versions of the open Creative Commons licenses. Creative Commons has evolved greatly since its inception in 2001, and is now widely adopted. We’re finding that CC licenses are becoming more widely accepted for datasets and databases. Another awesome thing about Creative Commons: they’ve created a tool to help you choose the appropriate license for your dataset.

Why we’re fans of the Creative Commons open licenses

  1. The open data community needs to come together under one common licensing standard in order to fully empower users to invest in open data: to clean it, explore it, use it, combine it, make it more usable and fully maximize its problem-solving capabilities.
  2. The current versions of the Creative Commons open licenses reflect the significant resources invested in exploring how different licenses interact for derivative works, the special considerations around dataset licensing, the implications of cross-border licensing, and the open data movement, among other considerations.
  3. We are seeing the open data community embrace the Creative Commons open licenses as the licenses of choice since their latest versions were released in 2013.
  4. By adopting a standard set of licenses for open data, a user coming across datasets with the Creative Commons licenses can easily determine whether any works created by combining those datasets are compatible and permitted under those licenses. A complex licensing scheme combining different license types can be overwhelming to a user and dissuade the user from proceeding with a project that could significantly better our lives.
  5. Last, but not least, they’ve given thought to how to streamline the license selection process and have created a nifty license selection tool that lets you quickly identify an appropriate license with simple yes or no questions. They also have a content-rich website to help guide you through your licensing considerations.

What about when you share your data on data.world?

There is a dropdown menu of licenses available from Creative Commons and Open Data Commons. If you can’t find the license you would like to use, select “Other” from the menu and, in the Summary, add the name of the license that applies to all the files in your dataset along with a link to the full license terms. If you would like files to have different licenses, create separate datasets based on license type and upload your files to the applicable dataset.
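If you have a pile of files under a mix of licenses, sorting them into one dataset per license is easy to plan before you upload anything. The short Python sketch below groups files by license. The file names, license labels and the group_by_license function are all hypothetical, and nothing here calls data.world itself; it only shows the sorting step.

```python
from collections import defaultdict

# Hypothetical file names and license labels, for illustration only.
files = [
    ("survey_2016.csv", "CC-BY 4.0"),
    ("survey_2017.csv", "CC-BY 4.0"),
    ("street_map.geojson", "ODC-ODbL 1.0"),
]


def group_by_license(files):
    """Group (file name, license) pairs into one file list per license."""
    groups = defaultdict(list)
    for name, license_label in files:
        groups[license_label].append(name)
    return dict(groups)


datasets = group_by_license(files)
for license_label, names in datasets.items():
    # Each group corresponds to one dataset to create and upload.
    print(f"{license_label}: {', '.join(names)}")
```

Each resulting group maps to one dataset, so every dataset ends up with a single license that covers all of its files.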

To learn more about licensing data, where to look for licensing on data.world, and the ever-interesting topic of Fair Use, please check out our General Counsel’s informative Licensing FAQ, which I’ve borrowed very heavily from (she gave me permission).

