In my article Kafka Versioning, I briefly covered different ways of handling versions and using schemas to manage changes to data structures. However, understanding these concepts is not enough when operating in an environment with numerous teams and bounded contexts. It is also necessary to identify patterns and principles for navigating model changes successfully.

This discussion extends beyond the confines of Kafka; it applies to most integrations between systems and services. I’ll briefly go through two concepts for handling data contracts between teams, DDD Context Mapping and Data Mesh, which can serve as a foundational framework in this intricate landscape.

DDD Context Mapping

Effectively managing breaking changes between teams is not a new problem, and I recommend that you remind yourself about context mapping by revisiting the bounded context chapters in Eric Evans’s book Domain-Driven Design, or by exploring What Is Domain-Driven Design? by Vladik Khononov. Khononov introduces, among other things, these three strategies:

(Upstream = U, Downstream = D):

The Conformist – The upstream team holds power, and the downstream team adheres to the upstream team’s model.

Source: https://www.oreilly.com/library/view/what-is-domain-driven/9781492057802/ch04.html

Anti-corruption Layer – The upstream team wields power, but the downstream team builds an anti-corruption layer to shield itself from structural changes:

Source: https://www.oreilly.com/library/view/what-is-domain-driven/9781492057802/ch04.html
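To make the anti-corruption layer concrete, here is a minimal Python sketch. All field and class names (`cust_no`, `Customer`, `CustomerAntiCorruptionLayer`) are hypothetical, invented for illustration; the point is only that the translation lives in one place:

```python
from dataclasses import dataclass

# Hypothetical upstream payload; the field names are assumptions
# made for this example, not taken from any real system.
upstream_event = {"cust_no": "C-1001", "addr": {"street": "Main St 1", "zip": "12345"}}

@dataclass
class Customer:
    """The downstream team's own model, insulated from upstream naming."""
    customer_id: str
    street: str
    postal_code: str

class CustomerAntiCorruptionLayer:
    """Translates the upstream model into the downstream model.

    If the upstream team renames or restructures a field,
    only this class needs to change."""
    def translate(self, event: dict) -> Customer:
        return Customer(
            customer_id=event["cust_no"],
            street=event["addr"]["street"],
            postal_code=event["addr"]["zip"],
        )

customer = CustomerAntiCorruptionLayer().translate(upstream_event)
```

The rest of the downstream code base only ever sees `Customer`, never the upstream structure.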

Open Host Service – The downstream team is empowered, and the upstream team must ensure their changes do not impact the downstream team, often seen in scenarios like providing a REST API:

Source: https://www.oreilly.com/library/view/what-is-domain-driven/9781492057802/ch04.html
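One way to picture an Open Host Service is as stable, versioned public representations that the upstream team maintains even as its internal model evolves. A minimal sketch, where the internal model and all field names are assumptions for illustration:

```python
# The upstream team's internal model has evolved (a single "name" field
# was split in two), but the published v1 representation stays stable so
# downstream consumers are not impacted. All names here are hypothetical.
internal_order = {"order_id": 42, "first_name": "Ada", "last_name": "Lovelace"}

def to_public_v1(order: dict) -> dict:
    """Published contract, version 1: still exposes a single 'name' field."""
    return {"id": order["order_id"],
            "name": f"{order['first_name']} {order['last_name']}"}

def to_public_v2(order: dict) -> dict:
    """Published contract, version 2: exposes the new structure."""
    return {"id": order["order_id"],
            "first_name": order["first_name"],
            "last_name": order["last_name"]}
```

Consumers migrate from v1 to v2 at their own pace; the upstream team retires v1 only once everyone has moved.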

Comment

It’s important to recognize that power dynamics are influenced by business priorities. It’s not always the upstream bounded context (producer) that has to handle the model changes; it can also be the downstream bounded context’s (consumer’s) responsibility.

Moreover, Kafka offers additional options for data transformation compared to direct service-to-service integration. For instance, messages can be transformed back to a previous version using tools like KSQL or Kafka Streams.
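The per-message logic such a KSQL or Kafka Streams job would apply can be sketched in plain Python. The version layout and field names below are assumptions for illustration, not from any real schema:

```python
# Sketch of the transformation a stream-processing job could apply to
# republish a v2 topic in the older v1 shape for consumers that have
# not migrated yet. Field names and versions are illustrative only.

def downgrade_v2_to_v1(v2_message: dict) -> dict:
    """Map a v2 message back onto the older v1 shape."""
    return {
        "schema_version": 1,
        # v1 had a single combined amount that v2 split in two:
        "amount": v2_message["net_amount"] + v2_message["tax_amount"],
        "currency": v2_message["currency"],
    }

v2 = {"schema_version": 2, "net_amount": 100, "tax_amount": 25, "currency": "SEK"}
v1 = downgrade_v2_to_v1(v2)
```

In a real deployment this function body would sit inside, for example, a Kafka Streams `mapValues` step reading the v2 topic and writing a v1 compatibility topic.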

Data Mesh

For a comprehensive guide on providing and ingesting data among teams, the Data Mesh article https://www.datamesh-architecture.com/ offers valuable insights. Data Mesh focuses on analytical data, an increasingly integral component for most organizations.

In short, Data Mesh advises against putting the full responsibility for the data in a separate data team. Instead, each domain team should provide a well-defined and documented data product for the other teams to consume. As the article states:

Domain-oriented decentralization for analytical data. A data mesh architecture enables domain teams to perform cross-domain data analysis on their own and interconnects data, similar to APIs in a microservice architecture.

With the Data Mesh concept, they define the architecture like this:

Source: https://www.datamesh-architecture.com/

Comment

Each domain team has a data product that they provide to other teams using a data contract. It’s important to document the policies around this and communicate the concept so that all involved parties know what is expected of them. The transition from a Data Team to a Data Platform Team that provides a Self-serve Data Platform is notable.
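A data contract can start out very simply: the providing domain team declares the fields and types of its data product, and records are checked against that declaration before being published. A minimal sketch, where the contract format and all field names are invented for illustration:

```python
# Hypothetical data contract for a domain team's data product.
CONTRACT = {
    "fields": {"order_id": int, "status": str, "placed_at": str},
}

def conforms(record: dict, contract: dict) -> bool:
    """Check that a record has exactly the contracted fields and types."""
    fields = contract["fields"]
    return (set(record) == set(fields)
            and all(isinstance(record[name], typ) for name, typ in fields.items()))

good = {"order_id": 1, "status": "SHIPPED", "placed_at": "2024-01-01"}
bad = {"order_id": "1", "status": "SHIPPED"}  # wrong type, missing field
```

Real data contract tooling adds much more (ownership, SLOs, semantics), but even this level of explicitness makes the producing team's promise visible.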

Conclusion

Data Product

While the Data Mesh concept may demand a certain level of maturity and experience in the data domain, its principles, particularly the data product approach, offer valuable considerations. The problem with teams that are not domain teams is that they become distanced from the actual customer requirements. When that happens, focus usually shifts towards other interests. Still, coordination around e.g. policies and contract management has to happen somehow.

If nothing else, I think the data product approach is worth thinking about. What data does each team provide to other teams, and how is it packaged using contracts? Merely starting to use the term data product changes our mindset a bit, I think.

Consider the basic concepts

Before deciding on strategies for managing breaking changes between teams, consider this:

  • Based on business priorities, which team is in power? How can the producing team think about their data as a data product?
  • Can we use Avro or Protobuf to handle, let’s say, 75% of all changes through contract mappings?
  • When is it wise to use versions to handle breaking changes? Can we use KSQL or Kafka Streams to stream data back to older versions, or to perform more complex transformations?
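The reason Avro (and, with a different mechanism, Protobuf) can absorb so many changes is schema resolution: the reader applies defaults for fields that were added after a message was written. A plain-Python sketch of that idea, not Avro’s actual implementation, with invented field names:

```python
# Sketch of reader-side schema resolution with defaults, the mechanism
# that lets a newer reader consume messages written with an older schema.
# Using None as the "no default" marker is a simplification of this sketch.

READER_SCHEMA = {
    "order_id": None,   # required field, no default
    "discount": 0,      # field added later, with a default
}

def resolve(message: dict, reader_schema: dict) -> dict:
    resolved = {}
    for field, default in reader_schema.items():
        if field in message:
            resolved[field] = message[field]
        elif default is not None:
            resolved[field] = default
        else:
            raise ValueError(f"missing required field: {field}")
    return resolved

old_message = {"order_id": 7}  # written before 'discount' existed
```

Adding fields with defaults like this is a non-breaking change; removing a required field or changing its type is not, and that is where versioning or stream transformations come in.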

Coordination needed

Effective coordination, rooted in communication and agreement on an approach, is indispensable when dealing with breaking changes across multiple teams. Use a step-by-step approach to ensure a smooth transition and try to align with organizational objectives.
