data.world is a Public Benefit Corporation
When data.world became a Public Benefit Corporation (PBC) in 2016, we committed to an ambitious mission to:
- Build the most meaningful, collaborative and abundant data resource in the world in order to maximize data’s societal problem-solving utility.
- Advocate publicly for improving the adoption, usability, and proliferation of open data and linked data.
- Serve as an accessible historical repository of the world’s data.
If this is your first exposure to a Public Benefit Corporation, I’m so happy that data.world is the one that gets to be your introduction. In 10 years, we think that the question will be “Why aren’t you a Public Benefit Corporation?” rather than “Why are you?”
A PBC is similar to a traditional corporation in that it is a for-profit corporation with shareholders who own the company and its assets. The key differentiator is that a PBC also has a clear mission to consider the impact of its actions on society. This means that a PBC can decide to do the things “for the good of humankind” that are set forth in its charter, and it has legal protection to do so. As a result, a PBC can balance that mission with the pursuit of shareholder returns rather than being forced to maximize shareholder value at all costs.
We are incorporated in Delaware because it was one of the early adopters of the PBC structure. We have since had a hand in bringing this structure to Texas, and we look forward to the day when it becomes the accepted norm among new businesses globally.
As a part of our PBC status, we’re obligated to give periodic updates to our shareholders on the status of our PBC purpose. However, we feel very strongly about the value of transparency and strive to be as open as we can with our community, which is why we’re sharing this report with all of you.
data.world Benefit Purpose
As a data-driven organization, we felt the best way to share our progress was to look at what we’ve done for each component of our PBC public benefit purpose:
- strive to build the most meaningful, collaborative and abundant data resource in the world in order to maximize data’s societal problem-solving utility,
- advocate publicly for improving the adoption, usability, and proliferation of open data and linked data, and
- serve as an accessible historical repository of the world’s data.
Building the most meaningful, collaborative and abundant data resource in the world in order to maximize data’s societal problem-solving utility
We’ve undertaken an ambitious journey to unleash the potential of data by bringing together that data and the people who are interested in it. We’re happy to report some of the advances we’ve made on this front.
U.S. CENSUS DATA
Early on we spent a fair amount of time making sure that the American Community Survey (ACS) data was available and linkable without charge for our members. We have collaborated with the US Census team on events and continue to work hard to help people use demographic data in their own data projects.
SOCIAL IMPACT DATA
We have been working hard to help source or host high-value datasets with an eye toward social good, social change, or other impactful areas. These datasets can help researchers and data scientists identify, understand, and combat challenges in today’s social landscape. Our efforts have included a month-long datathon around Black History Month, emergency resources for hurricane relief, work with the Anti-Defamation League to provide data around hate crimes and online hate speech, and many others.
While data itself is interesting, its power is limited without things like context, provenance, and the array of metadata that is generated in the course of a data project. We’ve been building tools that allow people to contextualize their data and centralize all of their data work so that the knowledge stays with the underlying data. This includes the conversations that are so often lost “at the watercooler” and other tribal knowledge that isn’t widely available to everyone who relies on the data.
data.world is quickly establishing itself as a platform that data journalists find helpful and transformative. The Associated Press was our first customer. It uses data.world to magnify its data science resources and distribute localized data to member newsrooms around the country.
A number of collaborative communities have started and grown on data.world. They use our platform to make data collaboration and sharing easier and more effective. Makeover Monday is a prime example. The group runs a weekly data visualization exercise to help their community collaboratively improve their data visualization skills.
With the increased scrutiny around social activism of all kinds, data-driven research is a necessity. Groups like Data for Democracy and Code for America have used data.world to ensure that the research and data gathered in the pursuit of a better society can be checked and reused by as many people as possible.
Between our members, our efforts, and related communities, data.world now has over 100,000 datasets on the platform and the rate of growth is accelerating at an exciting pace.
IMPACT ON OUR COMMUNITIES
Whether it’s civic activism, health research, or citizen journalism, there has been quite a bit of work done on the platform that is having an impact on some of the most complex problems facing humanity today. Two of our favorites:
- Social Media Bot Detection by Paragon Science
- Removed Facebook Pages: Engagement Metrics and Posts by Jonathan Albright
Advocating publicly for improving the adoption, usability, and proliferation of open data and linked data
DATA PRACTICES MANIFESTO
Because data science is a relatively new field, there is still a great deal of fragmentation between concentrations (AI, ML, deep learning, data journalism, visualization, etc.) and quite a bit of growth needed before those who aren’t already “data people” can understand and adopt the principles of good data teamwork. We were honored to sponsor a gathering of data science leaders, which resulted in the Data Practices Manifesto to help organize thinking around data practices and develop exercises to bring the power of data to anyone who is interested.
In collaboration with some of Austin’s most exciting tech startups, The University of Texas, and USAA, data.world has helped to kick off AI Global, a new nonprofit to advocate for the responsible development and deployment of artificial intelligence. This organization will focus on using ‘Responsible AI’ in businesses, communities, and academia to create a climate where all aspects of society understand and prepare for the impacts of AI. “As a founding member of AI Global, we’re delighted to see this innovative offering,” said Jon Cholak, Investment Director, USAA. “The AIGlobal Marketplace will really help software developers, data scientists, AI researchers, and product managers share high value assets, best practices, and enable the industry to innovate at a faster pace.”
While we feel that data is a resource best shared openly with the world, we also realize that there will always be data that either needs to be private or restricted in its applications. Licensing of data is complicated and often opaque, and we’ve spent time helping educate people and organizations on thinking through licensing implications so that data can be shared with the broadest set of people possible while owners still have assurance that it will be used in accordance with their wishes. Whether this is a private dataset shared only within an organization, a dataset that bridges the worlds of open and private under the new Community Data License Agreement (CDLA), or a dataset built for the improvement of all humankind under a Creative Commons license, there is an option that will work for data owners.
Since before we publicly launched, we have been on the road advocating for open data, from the White House Open Data Initiative to the UN World Data Forum, and that pace has only increased in the past two years. Whether through collaborative data events with groups like the US Census Bureau, Tableau, Code for America, and Data for Democracy or by bringing data to tangentially related events like the European Space Agency’s ActInSpace hackathon, we’re trying to make sure that open data, especially semantically linked data, is a tool that anyone can employ to make the world better. We’d love for you to join us in this effort. If you would like help promoting an event, preparing a talk, or just knowing where the best data events will be, feel free to drop us a line.
SUPPORTING LINKED DATA
Data should be easier to find, understand, and use. We make data “meaningful” using Semantic Web technology. We built our platform on top of semantic and linked data technologies to bring their power to the masses without the high learning curve and other steep barriers to entry. We often liken it to the early days of the web, when writing a webpage was a difficult task, until tools like content management systems and user-friendly web design tools made it a simple thing that anyone could do. We are striving to make all data easily linked and for that technology to be accessible to the broadest group of people possible. This improves data discovery and interoperability so people and machines can unlock its value faster. We’ve built a network where new datasets energize and enhance everything they connect to.
For those unfamiliar with the concept, Linked Data is one of the core tenets of the Semantic Web. The idea is to form relationships between data, similar to the relationships that were formed between documents by the World Wide Web. These relationships will be not only consumable by humans, but meaningful to machines. While these concepts are not new, historically they have been very difficult to use in practice.
By building our platform entirely on Linked Data and Semantic Web technology, we’re making it easy for both humans and machines to consume the vast collection of open data available on data.world. When each dataset that is added to the network increases the incremental value of every dataset on the platform, the network effect can be an incredible multiplier for people solving the world’s most difficult problems.
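For readers who like to see an idea in code, here is a minimal sketch of the Linked Data concept described above. It is an illustration only and not how data.world is implemented: the `Triple` type, the `link` helper, the `example.org` identifiers, and every value are made up for the example. The point it tries to show is that when two independent datasets refer to the same thing by the same URI, their facts can be joined with no extra matching work.

```python
from typing import Dict, Iterable, NamedTuple


class Triple(NamedTuple):
    """One statement: a subject has a predicate with some value."""
    subject: str
    predicate: str
    obj: str


# All URIs and values below are invented placeholders for illustration.
AUSTIN = "http://example.org/place/austin"
DALLAS = "http://example.org/place/dallas"
POPULATION = "http://example.org/vocab/population"
LIBRARIES = "http://example.org/vocab/libraryCount"

# Two hypothetical datasets published by different people.
population_data = [
    Triple(AUSTIN, POPULATION, "1000"),
    Triple(DALLAS, POPULATION, "2000"),
]
library_data = [
    Triple(AUSTIN, LIBRARIES, "3"),
]


def link(*datasets: Iterable[Triple]) -> Dict[str, Dict[str, str]]:
    """Merge triples from several datasets, grouped by their shared subject URI."""
    graph: Dict[str, Dict[str, str]] = {}
    for dataset in datasets:
        for triple in dataset:
            graph.setdefault(triple.subject, {})[triple.predicate] = triple.obj
    return graph


graph = link(population_data, library_data)

# Austin now carries facts from both datasets because they share one URI.
print(graph[AUSTIN])
```

Each dataset added this way can enrich every subject it shares with the others, which is the network effect the section describes.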
The exciting part about the data landscape is that there are so many conferences, events, and initiatives helping to grow the impact of this community. We’re trying to do our part by participating in as many as is feasible for a growing startup. Some of the places that you will find us in 2018 include:
- The International Semantic Web Conference
- The Open Data Institute’s Annual Summit
- Tableau’s 2018 Annual Conference
APIS AND INTEGRATIONS
We realize that advocating for the use of open data means more than just talking about it. It also includes making tools and documentation that help people walk the walk. We work hard on our open APIs and integrations so that people can use our platform to power the next generation of data applications. We have open sourced all the integrations we have built and encouraged others to do the same. Anyone can use the open side of our platform for free, so we’re leaving the door wide open for people to use the power of open data, linked data, and semantic web technology without the steep learning curve of days past.
Serving as an accessible historical repository for the world’s data
By even conservative estimates, there are more than 18 million open datasets available today. Unfortunately, the vast majority of these data are hard to find, harder to access, and difficult to work with. data.world is working hard to change all that with automated data ingest from many open data platforms, APIs that allow for automated upload of data by anyone who has data to share (whether atomic or streaming), and integrations with many of the tools people already use. We also believe that the data ecosystem should provide tools and storage to more than just open data. Many organizations are using data.world to collaborate on data internally, while being able to capitalize on the world of open data at their fingertips.
Just as software development has established rules for keeping track of changes to code, data needs to be tracked in order to be able to determine whether a dataset has changed, cite specific versions, and set expectations about how each version will differ. While data.world has always tracked the changes to a dataset, this past year we exposed versioning functionality to our users, including being able to download older versions. We feel this is an important step in the sophistication of data work and will continue to iterate on this functionality over time.
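To make the versioning idea concrete, here is a small sketch of one common way to track dataset versions: identify each version by a hash of its content. This is an assumption chosen for illustration, not a description of how data.world stores versions, and the `VersionedDataset` class and its method names are hypothetical. It shows the three needs named above: detecting whether a dataset has changed, citing a specific version, and retrieving an older one.

```python
import hashlib
from typing import List, Optional, Tuple


class VersionedDataset:
    """Keeps every distinct version of a dataset, identified by its SHA-256 digest."""

    def __init__(self) -> None:
        # (digest, content) pairs in the order they were committed.
        self._versions: List[Tuple[str, bytes]] = []

    def commit(self, content: bytes) -> str:
        """Store content as a new version unless it matches the latest one."""
        digest = hashlib.sha256(content).hexdigest()
        if self.latest() != digest:
            self._versions.append((digest, content))
        return digest

    def latest(self) -> Optional[str]:
        """Return the digest of the most recent version, or None if empty."""
        return self._versions[-1][0] if self._versions else None

    def get(self, digest: str) -> bytes:
        """Return the content of a specific, citable version."""
        for stored_digest, content in self._versions:
            if stored_digest == digest:
                return content
        raise KeyError(digest)

    def version_count(self) -> int:
        return len(self._versions)


dataset = VersionedDataset()
v1 = dataset.commit(b"city,count\nA,1\n")
v2 = dataset.commit(b"city,count\nA,2\n")

# The dataset changed, so the latest digest no longer matches v1,
# but the older version can still be retrieved by its digest.
print(dataset.latest() != v1)
print(dataset.get(v1))
```

Because the identifier is derived from the content itself, anyone citing a digest can later confirm they are looking at exactly the same data.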
Promoting a culture of social change
While our PBC benefit purpose is very explicit around the data ecosystem, we feel that the principles established there should hold true in all aspects of our lives, and we work hard to participate in our communities in the best ways possible.
data.world has set up a partnership with Encast/WorkHero to create individual accounts into which we as a company can match donations to causes that our employees feel strongly about. Employees can also use this account to track their volunteer hours and can take the account with them wherever they go. We also have a number of initiatives that have benefited worthy causes like The Andy Roddick Foundation.
Companies and communities benefit when businesses build diverse teams. We believe that building the best team possible means hiring the best candidate from a pool that represents a wide variety of backgrounds and experience, and we’re pleased to report that our commitment to diversity has increased the diversity of our team. That commitment continues to yield encouraging results, which include:
- Named one of Austin’s best places to work 3 years running
- 2017 A-List winner
- 2017 50 on Fire in Austin
We are also happy to report that we were one of the first companies to be independently certified by SameWorks as providing equal pay for equal work. To determine our qualification for this certification, SameWorks conducted a third-party audit of our workforce compensation based on transparent standards.
Certified B Corporation
In addition to being a Public Benefit Corporation, data.world is also a Certified B Corporation, and has been named to the “Best for the World” list two years running. BCorporation.net describes a Certified B Corporation as:
“Individually, B Corps meet the highest standards of verified social and environmental performance, public transparency, and legal accountability, and aspire to use the power of markets to solve social and environmental problems.
Collectively, B Corps lead a growing global movement of people using business as a force for good.™ Through the power of their collective voice, one day all companies will compete to be best for the world, and society will enjoy a more shared and durable prosperity for all.”