What is DataOps?

by | Apr 9, 2021 | Agile Data Governance, Data catalogs

Imagine this: it’s time for your family’s weekend getaway after a tough week at work and school. In the midst of the fun, you get a text that a critical data pipeline broke at work, and you need to fix it immediately.

Now your weekend plans are on hold as you head back to your laptop to fix the issue. Frustrated by work pulling you away from quality family time, your partner asks, “can’t you build something that doesn’t break on Saturday?”

This is exactly what happened to Chris Bergh, CEO of DataKitchen, who shared this anecdote with us at data.world summit 2021.

The pain is real. And you can’t ignore it. Data engineers and scientists are acutely aware of this. Processes breaking, integrations failing, confusing workflows, suddenly missing or bad data. On the weekends. After hours. The state of data operations today is messy. Fighting endless fires and answering ad-hoc questions and complaints in a constantly reactive state is unsustainable. It limits your potential and slows business ROI.

 

Enter: DataOps

More and better technology alone will not solve these problems (don’t fall into the technology fallacy). The solution is your people, processes, systems, and workflows working together.

In the development world, we’d call this DevOps. But analytics can and should be treated like code too. And we need a parallel methodology for data: DataOps.

DataOps unlocks the flow of data and information between data producers and consumers, allowing them to better use it in their day-to-day work. It considers not just the technology but the people and the process. It enables people to build an efficient, scalable, and maintainable data supply chain.

With DataOps, you can focus on rapidly and reliably getting data to production, enabling trust, and fostering data as a team sport.

 

“You’ll see a 30-40% increase in usage of your data if you actually talk to your users” - Ashleigh Faith, Director at EBSCO during Catalog & Cocktails

Hear our full conversation with Ashleigh Faith in this episode of Catalog & Cocktails.

 

How does it work?

There are five core tenets as you approach DataOps as a solution:

  1. Observability: use technology that serves as an information radiator for your data, and make it easy to incorporate this into analytics work
  2. Enablement: no matter your role, data should be ready to use and intuitive to find, understand, trust, and access
  3. Collaborative: your team should be able to interact and use data cross-functionally, and don’t forget to capture knowledge and feedback from your subject matter experts
  4. Iterative: combine human-led and machine-led curation to quickly deliver data, and continuously improve older data & analytics resources
  5. Open ecosystem: create an environment that’s interoperable with your current, future, and best-of-breed data technology stack with open standards and APIs

Building this system allows your team to create better, more usable data products, improve customer satisfaction, enhance strategies, and innovate exponentially.

To put it simply, DataOps adds massive value to your business by enabling better workflow and knowledge sharing between your data producers and consumers. Now, you can truly succeed as a collaborative team.
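
As a rough illustration of the observability tenet above, here is a minimal Python sketch of a health check that a pipeline could run on each batch before anyone gets paged on a Saturday. Everything in it is an assumption made for illustration: the `check_batch` function, the `BatchReport` type, the thresholds, and the sample `order_id` and `amount` columns are hypothetical and do not come from any particular DataOps tool.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class BatchReport:
    """Summary of one pipeline batch, suitable for posting to a dashboard."""
    row_count: int
    null_rates: dict
    issues: list


def check_batch(rows, required_columns, loaded_at, now,
                max_null_rate=0.05, max_age=timedelta(hours=24)):
    """Run basic health checks on a batch of rows (a list of dicts).

    The thresholds are illustrative defaults, not recommendations.
    """
    issues = []
    if not rows:
        issues.append("batch is empty")
    if now - loaded_at > max_age:
        issues.append("batch is stale")

    null_rates = {}
    for column in required_columns:
        # Count rows where the required column is absent or None.
        missing = sum(1 for row in rows if row.get(column) is None)
        rate = missing / len(rows) if rows else 1.0
        null_rates[column] = rate
        if rows and rate > max_null_rate:
            issues.append(f"{column}: {rate:.0%} of values missing")
    return BatchReport(len(rows), null_rates, issues)


# Hypothetical sample batch: one of two rows is missing its amount.
now = datetime(2021, 4, 10, 9, 0, tzinfo=timezone.utc)
rows = [
    {"order_id": 1, "amount": 20.0},
    {"order_id": 2, "amount": None},
]
report = check_batch(rows, ["order_id", "amount"],
                     loaded_at=now - timedelta(hours=2), now=now)
print(report.issues)
```

A report like this could be published alongside the dataset's catalog entry so that data consumers see the same health signals as producers, which also touches on the enablement and collaboration tenets.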

 

Who’s involved?

Process-focused data analysts, scientists, and engineers are at the heart of your DataOps strategy, says Bergh. He adds that while release engineers were once the lowest-paid developers, today’s DevOps engineers are the highest paid and make up 30% of engineering teams.

They’re valuable for a reason: DevOps is pivotal to collaborative workflows that allow teams to ship faster and be more productive. DataOps brings that value to your data supply chain, and to your data consumers and producers.

 

"It's more than our skills, it's more than our passion. We bring the best of ourselves together to work on these problems." DJ Patil, Former US Chief Data Scientist, speaking at data.world summit 2020

Listen to our full conversation with DJ Patil, Former US Chief Data Scientist.

 

If it isn’t already obvious, you must have a team! And you need them to interact and face your data challenges together. While solo analytics work is feasible for a while, it simply isn’t scalable, and your workflows won’t mature.

Data is a team sport. Keep those channels open. Agility and transparency are key as you and your team continue to integrate DataOps strategies into your day-to-day work.

Learn more: consider a data product manager for your team.

 

Technology isn’t the solution, but it is a part of it

According to Rethink Data, a DataOps tool must combine the functionality of “metadata management, data classification, and policy management.”

A best-in-class enterprise data catalog occupies a unique spot in the data tech stack and supports these three critical capabilities and more. It should be the backbone needed to adopt DataOps. Modern data catalogs are built for: 

  • People: a user experience built for everyone who works with data: from producers such as data engineers and stewards, to data consumers such as analysts and scientists
  • Process: a unified, centralized experience to manage and use data and analytics resources, with documentation to better understand them
  • Technology: a connected hub that adds a unified, friendly layer to a complicated architecture, and is interoperable with your toolchain

A data catalog serves as your unified hub to bring people, process, and technology together. Everyone must be involved in managing, curating, and overseeing the data catalog to ensure your supply chain keeps humming along and continuously improves.

 

“In a recent Forrester survey - only 56% of enterprise IT groups report adopting agile principles in the data and analytics practice” - Michele Goetz, VP & Principal Analyst at Forrester

 

The future of data is agile

We need to manage data as a product, and we need to bring agility. This is the clear path forward to sustainable, scalable integration of data and analytics into your business. 

As a data engineer or data scientist, you shouldn’t feel like you’re constantly fighting one fire after another. You shouldn’t need to constantly throw life preservers (that should be a last resort, not the default action). DataOps is about building a sustainable bridge that enhances reusability and reproducibility, and enables you to work on higher-impact projects and initiatives.

DataOps starts by pulling together your team’s data, analytics, workflows, knowledge, and operations in one central location. Foster openness, collaboration, and reuse to unlock that critical business value, and be competitive in today’s data-driven economy.

It’s time to unleash groundbreaking analytics and accelerate data delivery in your organization.

 

Learn more

Ready to dive into the world of DataOps? We recommend checking out these resources, too: