Algorithms are only as good as the data that feeds them. Data scientists are deeply familiar with the impact of GIGO (garbage in, garbage out), where inaccuracies, fragmented data, and the like result in poor analysis. But the same principle applies to fairness, inclusion, and bias in the data, and many teams don’t yet account for it. Unless you are certain your team fully understands the bias in your data, you’re likely making business-impacting decisions on flawed data.
Even the most sophisticated algorithms can deliver misleading results if the supporting data sets are too narrowly (or inaccurately) focused. As a recent Fast Company article notes, a Google image search for the phrase “professional hairstyles” returns primarily white hairstyles, while an image search for “unprofessional hairstyles” returns results in which black hairstyles are significantly over-represented.
This isn’t a premeditated decision on Google’s part, obviously: it’s just the result of an algorithm working with a limited dataset that draws too heavily on a specific demographic. The same phenomenon is responsible for the problem of facial recognition software being more likely to accurately identify men and those with lighter complexions.
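One way a team might spot this kind of skew before a model ever sees the data is to compare each group's share of a data set against a reference share. The following is a minimal sketch, not an established method: the function name `representation_gaps`, the group labels, the reference shares, and the 10-point tolerance are all hypothetical choices made for illustration.

```python
from collections import Counter


def representation_gaps(labels, reference_shares, tolerance=0.10):
    """Return groups whose observed share differs from the reference
    share by more than `tolerance`, as {group: (observed, expected)}."""
    counts = Counter(labels)
    total = sum(counts.values())
    gaps = {}
    for group, expected in reference_shares.items():
        # Share of the data set that belongs to this group (0.0 if empty).
        observed = counts.get(group, 0) / total if total else 0.0
        if abs(observed - expected) > tolerance:
            gaps[group] = (observed, expected)
    return gaps


# Hypothetical data set: 80 records from one group, 20 from another.
sample_labels = ["group_a"] * 80 + ["group_b"] * 20
# Hypothetical reference: the population the product is meant to serve.
reference = {"group_a": 0.5, "group_b": 0.5}

gaps = representation_gaps(sample_labels, reference)
for group, (observed, expected) in sorted(gaps.items()):
    print(f"{group}: observed {observed:.0%}, expected {expected:.0%}")
```

A check like this only flags imbalance in whatever attribute has been recorded. Deciding which attributes matter, and what the reference population should be, remains a human judgment.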
An emerging practice called data ethnography, however, could play a fascinating role in resolving these issues.
Why this role matters
Traditionally, ethnographers seek to understand how people relate to each other, how meaning is conveyed, and how cultures evolve. A community’s data is really no different: It too requires an understanding and explanation of where it comes from and what it means in order to be more accurately understood and applied.
A data ethnographer looks at an organization’s data sets from a deeper perspective, asking those same traditional questions about the age, culture, and provenance of data. The ethnographer is also responsible for making that information readily available, ensuring it travels with the data it describes.
Essentially, data ethnographers have the task of giving data a more defined identity that contextualizes the viewpoint it provides in analysis. They make it impossible to think of data analytics as a coldly objective math game. There are, after all, always imperfect human elements and inputs involved.
By tracking and labeling data sets from a deeper, culturally informed perspective, data ethnographers can play a critical role on a modern data team. This process helps ensure that broadly used algorithms, products, services, and other public-facing offerings are more universal in nature, and not skewed toward any particular group.
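To make the idea of context that travels with the data concrete, here is a minimal sketch of a data set wrapper that carries its provenance through filtering and export. The class names `Provenance` and `LabeledDataset`, their fields, and the sample survey are all hypothetical; a real team would choose the descriptive fields that matter for its own data.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Provenance:
    """Hypothetical descriptive label for a data set."""
    source: str             # where the data came from
    collected: str          # when it was gathered
    population: str         # who is represented in it
    known_gaps: tuple = ()  # who or what is known to be missing


@dataclass
class LabeledDataset:
    records: list
    provenance: Provenance

    def filter(self, predicate):
        # Derived subsets keep the same provenance label.
        kept = [r for r in self.records if predicate(r)]
        return LabeledDataset(kept, self.provenance)

    def to_json(self):
        # Export the label alongside the records, not separately.
        return json.dumps(
            {"provenance": asdict(self.provenance), "records": self.records}
        )


survey = LabeledDataset(
    records=[{"age": 34, "region": "north"}, {"age": 61, "region": "south"}],
    provenance=Provenance(
        source="hypothetical customer survey",
        collected="2016",
        population="online respondents only",
        known_gaps=("no offline customers",),
    ),
)

older = survey.filter(lambda r: r["age"] > 50)
print(older.provenance.known_gaps)
```

Because the label is attached to the data set object rather than kept in a separate document, an analyst working with a filtered subset still sees who the data does and does not represent.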
Data artists explained
The data ethnographer is one of several new roles within the emerging category of data artists. Data artists help interpret and contextualize data. They apply these insights (which may draw on psychology, language, or ethnography) within an ethical framework to ensure that data is used in a manner that’s responsible, inclusive, and resonant with consumers.
In this manner, data artists work hand in glove with data scientists, giving them a deeper understanding of what the data says and what it conveys about what people truly want and need. By working together, both roles help create compelling links between those who provide data and those who are tasked with studying it.
In a very short period of time, data has become perhaps the most valuable commodity on the planet. Learning how to use data in responsible, ethical, and highly productive ways is one of today’s most difficult and complicated business challenges.
New roles such as data ethnographer and other data artist specialties have a critical part to play in the evolution of the modern data team. Companies that prioritize staying at the vanguard of such changes will be able to withstand public scrutiny while remaining in the strongest competitive position.