Recap: data.world spring summit 2022 – Open Data for the Win Track

by | Apr 28, 2022 | 2022, data culture, POV

If you missed the spring 2022 data.world summit, have no fear.

You can now watch sessions from our live, virtual, bi-annual event, where we invite industry luminaries to discuss data mesh, open data, knowledge-first, and so much more!

We’ve already recapped the Knowledge First and Practitioner’s Paradise tracks. Here, we’re recapping the Open Data for the Win track, where we showcased the transformative power of open data in society, public projects, journalism, and the enterprise.


The Why and How of Open Data

Amber Thomas, Community Lead at data.world, kicked off the Open Data for the Win track with her presentation, “The Why and How of Open Data.”

Amber Thomas speaks to the principles of FAIR data next to a bulleted list

Amber is responsible for data.world’s data for good and social impact initiatives, and the largest and longest-running of these initiatives is our open data platform, where over 1.5 million people have come to find and contribute to hundreds of thousands of free and open data sets.

Amber began her talk by comparing the structure of the World Wide Web, and the accessibility of information it made possible, to the idea of “open data,” data that “can be freely used, modified, and shared by anyone for any purpose.” Amber illustrated the value of open data by explaining how the global sharing of COVID data helped journalists and policymakers during the pandemic.

Amber then explained how you can open your data and make it more available for other people to answer questions and solve problems, detailing how to obtain an open data license, how to structure open data for easy use, why it’s important to use a non-proprietary data format like .csv, .json, or .txt, and more.
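As a rough illustration of the non-proprietary format point, here is a minimal Python sketch that writes the same small table as both CSV and JSON text using only the standard library. The records and field names are hypothetical and are not taken from Amber’s talk.

```python
import csv
import io
import json

# Hypothetical sample records; the field names are illustrative only.
records = [
    {"city": "Austin", "year": 2022, "value": 10.5},
    {"city": "Boston", "year": 2022, "value": 8.25},
]


def to_csv(rows):
    """Serialize a list of dicts to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(rows[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def to_json(rows):
    """Serialize the same rows to indented JSON text."""
    return json.dumps(rows, indent=2)


csv_text = to_csv(records)
json_text = to_json(records)
```

Both outputs are plain text, so anyone can open them without specialized or licensed software.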

Next, Amber introduced the FAIR data principles, which describe data that is:

  • Findable
  • Accessible
  • Interoperable
  • Reusable

“FAIR data is really about reuse,” she explained. “It’s about providing the information that people need to use your data for another purpose.”

Making US Hospital Pricing Data More Open

Amber then joined data.world Principal Solutions Architect Dean Allemang to share how data.world is helping to document public hospital pricing data, allowing people to more easily compare prices for standard procedures at hospitals across the US.

Amber Thomas and Dean Allemang talk about data.world's effort to document public hospital pricing data

“A few years ago, the Centers for Medicare and Medicaid Services released a ruling that hospitals across the country had to publicly release standard charges, or how much money you pay for specific goods or services rendered at a specific hospital,” Amber explained. “This seems like a really great thing, because if you’ve ever tried to get medical care here in the US, figuring out how much that’s going to cost is hard.”

“What actually excited us a lot about this was this ruling actually requires hospitals to release standard charges, both in a consumer-friendly format and in a machine-readable file,” she continued. “So there are all sorts of files that meet some level of open data now available, but it turned out all of these files were scattered across individual hospital websites. We wanted to combine them.”

Dean and Amber spent the remainder of the session explaining the technical steps data.world took to combine hundreds of disparate data files from different hospitals, all of which were formatted differently, eventually creating a single resource.
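To give a sense of what combining differently formatted files can involve, here is a minimal Python sketch that maps two small, made-up hospital exports onto one shared schema. It is not data.world’s actual pipeline: the hospital names, column names, delimiters, and prices are all hypothetical.

```python
import csv
import io

# Hypothetical exports from two hospitals, each with its own layout.
HOSPITAL_A = "Procedure,Gross Charge\nMRI,1200.00\nX-Ray,150.00\n"
HOSPITAL_B = "description;price_usd\nMRI;950\n"

# Per-source configuration: the delimiter used by the file and a mapping
# from each source column to the shared column name (all assumed).
SOURCES = [
    {
        "hospital": "Hospital A",
        "text": HOSPITAL_A,
        "delimiter": ",",
        "columns": {"Procedure": "procedure", "Gross Charge": "charge_usd"},
    },
    {
        "hospital": "Hospital B",
        "text": HOSPITAL_B,
        "delimiter": ";",
        "columns": {"description": "procedure", "price_usd": "charge_usd"},
    },
]


def normalize(source):
    """Yield rows from one source file, renamed to the shared schema."""
    reader = csv.DictReader(
        io.StringIO(source["text"]), delimiter=source["delimiter"]
    )
    for row in reader:
        record = {"hospital": source["hospital"]}
        for original, shared in source["columns"].items():
            record[shared] = row[original]
        # Store charges as numbers so they can be compared across hospitals.
        record["charge_usd"] = float(record["charge_usd"])
        yield record


# One combined table covering every source.
combined = [record for source in SOURCES for record in normalize(source)]
```

Keeping the per-file differences in a configuration table means a new hospital can be added by describing its layout rather than writing new parsing code, although real files would likely need far more cleanup than this.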

“And that’s sort of the big idea of the web and of data on the web,” explained Dean. “If I can be a good data citizen on the web, by taking data that’s kind of obtuse and annoying, and work on it a bit for my own value, when I’m all done, I can share it out.”

“This is what FAIR data is all about,” he concluded. “The idea is that when you make your data FAIR, now anyone in the world can find it, access it, operate with it, and reuse it.”

The Power of Open Data at WPP

Mircea Danciulescu, Global Data Manager at WPP, joined us to illustrate how his organization, one of the largest marketing and advertising holding companies in the world, leverages open data, and how they approach its use.

WPP's Mircea Danciulescu speaks to the “four powers” of open data

Mircea began his presentation by sharing a quote from our very own Juan Sequeda: “When the world’s data is transformed into knowledge, opportunities emerge for everyone.”

“The data that is available to us is very wide and varied,” he said. “Because we have access to such a broad range of data, we have established data ethics guidelines that data should be shared and used responsibly, well managed, and well used. And that’s why we invested in a data catalog.”

Mircea then spoke to the “four powers” of open data — see above — and how access to the “brilliant data of the world’s community” provides WPP with much more than if they relied upon only the data they produced internally.

To illustrate his point, Mircea played a short video his company produced for the 2020 US census.

“This example in particular resonates because we used open data in conjunction with survey data, and also proprietary data from our client, to drive the adoption and the engagement with the US census in 2020,” he explained. 

Lessons from Stacker Media’s Open Data Journey

Stacker Media’s Vice President of Distribution Ken Romano and Data Reporter Emilia Ruzicka joined us to talk about how Stacker uses open data to empower the world’s publishers.


“Stacker is a newswire like AP or Reuters, but specifically for data-driven features; any sort of evergreen feature journalism that begins with data or research,” explained Ken.

“Everything that we do, everything that we think about is in service of other publishers and how to empower them, to tell the stories that impact their communities and the audiences in which they live,” he continued. “At the root of what we do is accessibility. Our stories are accessible to any reader, regardless of your expertise or experience. Our newswire is accessible to any publisher regardless of what technology platform they’re on. And all of our stories can be republished from our website under a Creative Commons license.”

Emilia then spoke to the importance of open data in journalism, and how it provides an archive of shared resources between journalists who work across organizations, as well as between journalists, the media, researchers, academics and the general public.

“Providing open data gives a hand up to smaller or local newsrooms who might not have as large of a team to dedicate to things like deep dive data investigations,” she said. “In addition, open data gives the general public an insider look into the black box of data, a peek under the hood, in order to build a greater level of trust between journalists, researchers, government organizations, nonprofits, and of course the public consuming the news each and every day.”

Emilia went on to explain how Stacker compiles and shares their datasets:

  • Releasing the data set in a CSV format accessible on data.world, on GitHub, and on Stacker’s own site
  • Writing a story about the data set that we’ve created and analyzed and sharing it under our Creative Commons license so other news organizations can republish it
  • Releasing a blog post describing the analysis and cleaning process for the data, providing context for users

911 Systems and Open Data: The Reimagine 911 National Action Team at Code for America

Billy Lim of Code for America spoke to his organization’s efforts to reimagine the US 911 system from a service that sometimes results in devastating harm and lives lost — particularly in communities of color — to a service that gets individuals experiencing crisis the right help from the right service at the right time.

Billy Lim of Code for America speaks to his organization’s efforts to reimagine the US 911 system

“The 911 system fields an estimated 240 million calls every year,” said Billy. “But the infrastructure supporting 911 operations varies radically from jurisdiction to jurisdiction, particularly as it relates to the degree of technological sophistication associated with routing, responding to and documenting this tremendous call volume. There are huge opportunities to modernize these 911 systems by generating more accurate information in support of both efficient and equitable responses to 911 calls, and by facilitating data sharing and interoperability between different dispatcher systems and responders.”

Billy went on to detail how one of the best ways his organization can effect positive change is by making strides towards open standardization of 911 call data. The benefits of this, he said, are many and include:

  • Enabling a more accurate assessment of risk and reducing unnecessary police response and biased outcomes
  • Ensuring the quick rerouting of 911 calls to alternative hotlines
  • Improving the wellbeing of 911 professionals and first responders by modernizing data systems, which reduces strain and burden in their everyday work
  • Empowering jurisdictions to collaborate and share data in ways that under the status quo are simply not possible

“Our nation’s 911 system is incredibly complex and its systems have not kept pace with the needs of its stakeholders, much less the people that rely on it,” said Billy. “The work of Code for America’s Reimagine 911 National Action Team demonstrates that civic technologists, advocates, and organizers can come together to demonstrate real possibilities for systems-level change. I’m inspired every day by the work that continues around making 911 call data more accessible and open.”

In case you missed it…

Missed the data.world spring summit? No sweat! You can watch every session on demand.
