Data mesh at DPG Media
Inspired by Zhamak Dehghani's article on data mesh, DPG Media has decided to move in the data mesh direction as well. More background has been published in collaboration with Snowflake. In this blog post, DPG Media shares its vision and focuses on the organizational side of things, while remaining vendor and tool agnostic.
As in many organizations, data is seen as a key enabler for future-proof growth. That is why “How to become truly data-driven” is a Key Strategic Question for DPG Media. Although this question is considered crucial and will have a large impact on the organization (even more than on the architecture), DPG Media takes an agile approach to reaching its goals and answering it.
Start with WHY
First things first: before describing how DPG Media installs the data mesh culture, we address WHY data mesh answers the question “How to become truly data-driven” and WHY we want to become truly data-driven.
Looking at where we are in 2021, we notice that data is at the center of our digital strategy and that we are launching data-driven products and propositions mainly for our consumers or advertisers. Whilst this is very valuable, we decided not to single out the word consumer in our Key Strategic Question. DPG Media wants to extend its use of data beyond digital and consumer-oriented data products and insights, towards company-wide enablement of a data-driven culture.
Why data mesh? Again we refer to the linked articles, but in summary:
- A central team of highly specialized data engineers has become a bottleneck in delivering all data products we want to build
- It is hard to make choices on priorities for the data team: “who do you enable first, advertising or marketing?”
- It’s impossible for 30 data engineers to understand all applications being built by 600 IT professionals to serve 7000 employees and reach 8 million people on a daily basis.
The transition: an organizational Proof Of Concept
As mentioned above, DPG Media takes an agile approach towards its new data organization form. It has not been a big bang organizational shift. At the end of 2019, at the start of its data mesh vision, DPG Media began by centralizing data efforts: the decentralization that existed at that moment had no central governance in place, so it was more of a data mess. The first focus was on finding synergies, limiting the architecture, introducing high-level data domains, and transitioning from historical country-specific teams to domain teams. All of this reduced the scope in which such governance had to be installed.
Within one year we moved away from country-specific or component-specific (e.g. data ingestion) teams, towards four high-level domains: B2B, B2C, User behavior, and profile. We assign people to teams and projects through a job fair, where the product owners of a domain present the roadmap for the upcoming quarter and data engineers can choose which project to work on (which work inspires you?), yet with a focus on building stable teams that can become domain experts.
Next to this shift, DPG Media went a bit further with an organizational test: what happens when you not only give data engineers a domain focus but also place them with the actual domain experts? This has been tested in two domains. The user behavior team consists of both tracking specialists and data engineers, but in this post we will focus on the subscription domain, which is a subdomain of the high-level domain B2C.
Two data engineers were placed within the team that is building a new subscription system, our fifth one. That is not because we like subscription systems, but because we grew by acquisition; the ambition is to migrate all previous systems to the new one. The team's product owner suddenly became responsible for defining the roadmap for the data engineers within the subscription domain. A first data product owner outside of the central data team!
Although the roadmap of the subscription team might be one of the most challenging at DPG Media (next to building the system, they need to support the old ones and enable new product set-ups), the outcome of this Proof of Concept was very valuable.
Of course, not everything was perfect from day one. DPG Media learned some lessons from this first set-up, and these were taken into account for the further roll-out of the data mesh.
The data mesh at DPG Media: what is it?
With these lessons in mind, DPG Media re-confirmed that the data mesh was the answer to the Key Strategic Question “How to become truly data-driven”. To make this more concrete, we defined the following guiding principles.
Every important entity (e.g. customer) gets a company-wide definition and a definition owner. Although an entity can appear in multiple domains, and its bounded-context-specific elements might be different for each domain, there is a common understanding of the entity and its granularity (e.g. a customer is an individual, not a household, or vice versa).
Every data product has a purpose and a purpose owner. The purpose and technical ownership reside in the most relevant domain. To give an example: marketing automation resides within the marketing domain.
We decentralize data capabilities and lower the technical barrier to enable the entire company to work self-service with their data, whilst keeping software engineering best practices in mind.
We do, however, centralize data governance and data quality rulings, with an emphasis on the word rulings: it is still up to the domain teams to implement and adhere to these rules, yet a minimal set of rules and expert guidance is provided centrally.
We adhere to the DATSIS principles of the data mesh (discoverable, addressable, trustworthy, self-describing, interoperable, and secure); here again we refer to the articles mentioned above.
Because ownership is defined on a purpose level, a product level, and a technical level, we emphasize that these are three different roles:
- A purpose is owned by a company director
- A product is owned by a product owner in the relevant domain
- As the product owner is most likely someone without deep data knowledge, the technical ownership resides with a data engineer
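To make the three roles tangible, here is a minimal sketch of how a data product and its owners could be recorded. The class, field names, and example values are hypothetical and chosen only for illustration; they do not describe an actual DPG Media tool or catalog.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DataProduct:
    """Hypothetical record of one data product and its three owners."""
    name: str
    domain: str                             # e.g. "marketing"
    purpose_owner: Optional[str] = None     # a company director
    product_owner: Optional[str] = None     # product owner in the relevant domain
    technical_owner: Optional[str] = None   # a data engineer

    def missing_roles(self) -> list[str]:
        """Return the ownership roles that have not been assigned yet."""
        roles = {
            "purpose_owner": self.purpose_owner,
            "product_owner": self.product_owner,
            "technical_owner": self.technical_owner,
        }
        return [role for role, person in roles.items() if not person]


# Illustrative values only; the owner names are invented for this example.
marketing_automation = DataProduct(
    name="marketing automation",
    domain="marketing",
    purpose_owner="director of marketing",
    product_owner="marketing product owner",
)

print(marketing_automation.missing_roles())  # ['technical_owner']
```

A record like this keeps the three roles separate, and a check such as `missing_roles` could help when taking inventory of existing data products that still lack an owner.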
Now that the approach has been rolled out in one domain, has proven its value, and the vision is on paper, what remains is a deep dive into how to move towards this central vision. As moving people between departments brings a budget shift, this is the timeline we are working towards:
- Clarify everything around the budget and
- Widely implement this central vision as of early 2022.
The work to be done before then mainly concerns the following topics:
- Introduce the last missing component from a technological point of view (I promised not to talk about architecture)
- Identify current data products and assign owners and domains to them
- Define the first set of data quality and data governance rulings and enable them
- Identify budget impact from moving people throughout the organization
- Identify and be able to measure the budget impact of moving the usage of data tools throughout the organization
- Build relevant data communities
These communities focus on several things:
- The first one is a more typical guild set-up where people can learn from each other across domain borders in the technical usage of data engineering tools, data science models, …
- The second one is focused on having communities of experts who can assist multiple departments with their knowledge of tools (like self-service visualization tools).
- Next to that, we also bring data governance communities to life: one from a data steward perspective (the product owners) and one from a data governance implementation perspective (the technical owners).
If you think this is an inspiring data story, do not hesitate to reach out to our recruiters, as we are still looking for people to assist in implementing this vision!