Data mesh
Data mesh is an architectural framework and operating model for the domain-driven development of data products and applications. It embraces the ubiquity of data in the enterprise through a domain-oriented, self-serve design (from a software development perspective) and borrows from Eric Evans’ theory of domain-driven design.[1] Its main proposition is that instead of building large, centralized data platforms, enterprise data architects should create distributed datasets, or data meshes, each managed by a particular group of domain experts.[2] These domain teams know how to shape their data into the data types and file formats that meet the needs of data consumers throughout the organization.[2]
History
The term data mesh was first defined by Zhamak Dehghani in 2019[3] while she was working as a principal consultant at the technology company ThoughtWorks.[4][2] She then provided greater detail on its principles and logical architecture throughout 2020. The approach was predicted to be a “big contender” for companies in 2022.[5][6] Companies such as Zalando,[7] Netflix,[8] Intuit,[9] VistaPrint and others have implemented data meshes, with support from software factories such as Agile Lab,[10] which has been described as one of the first companies in the world to approach data mesh in a concrete way.[11]
Principles
A data mesh is defined by four principles:[4][12]
- Domain-oriented, decentralized data ownership and architecture (data is owned locally by the team responsible for collecting and/or consuming it[4])
- Data as a product[13]
- Self-service data infrastructure as a platform
- Federated computational governance
In addition to these principles, Dehghani writes that the data products created by each domain team should be discoverable, addressable, trustworthy, self-describing in both semantics and syntax, interoperable, secure, and governed by global standards and access controls.[14] In other words, the data should be treated as a product that is reliable and ready to use.[15]
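As an illustration only, the attributes listed above can be sketched as a small Python model. Nothing here comes from Dehghani's writing or from any real data mesh product: the `DataProduct` and `Catalog` classes, their fields, and the `mesh://` address scheme are all hypothetical names chosen for this sketch. It assumes that each domain team publishes a descriptor for its data product and that a shared catalog applies the same global rules to every domain.

```python
from dataclasses import dataclass


@dataclass
class DataProduct:
    """Hypothetical descriptor a domain team publishes for one data product."""
    name: str
    domain: str                  # owning domain team
    address: str                 # stable, unique identifier (addressable)
    schema: dict[str, str]       # column name -> type (self-describing syntax)
    description: str             # plain-language meaning (self-describing semantics)
    allowed_roles: frozenset[str] = frozenset()  # access control (secure)


class Catalog:
    """Hypothetical mesh-wide registry that applies the same rules to all domains."""

    def __init__(self) -> None:
        self._products: dict[str, DataProduct] = {}

    def register(self, product: DataProduct) -> None:
        # Global standards: checked identically for every domain team.
        problems = []
        if not product.description.strip():
            problems.append("missing description")
        if not product.schema:
            problems.append("missing schema")
        if not product.allowed_roles:
            problems.append("no access roles defined")
        if product.address in self._products:
            problems.append("address already in use")
        if problems:
            raise ValueError(f"{product.address}: " + ", ".join(problems))
        self._products[product.address] = product

    def discover(self, keyword: str) -> list[DataProduct]:
        # Discoverable: search by name or description across all domains.
        kw = keyword.lower()
        return [
            p for p in self._products.values()
            if kw in p.name.lower() or kw in p.description.lower()
        ]

    def resolve(self, address: str, role: str) -> DataProduct:
        # Addressable and secure: look up by address, then check the caller's role.
        product = self._products[address]
        if role not in product.allowed_roles:
            raise PermissionError(f"role {role!r} may not read {address}")
        return product


# Usage: the (hypothetical) sales domain publishes its orders data product.
catalog = Catalog()
catalog.register(DataProduct(
    name="orders",
    domain="sales",
    address="mesh://sales/orders",   # hypothetical address scheme
    schema={"order_id": "string", "total": "decimal"},
    description="Confirmed customer orders",
    allowed_roles=frozenset({"analyst"}),
))
print([p.address for p in catalog.discover("orders")])
```

In this sketch the domain team keeps ownership of the data and its descriptor, while the catalog only enforces shared rules. Attributes such as trustworthiness and interoperability would need additional machinery (for example quality checks and shared type vocabularies) that is left out here.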
See also
- Data management
- ETL and ELT
- Data warehouse, a well-established type of database system for organizing data in a thematic way
References
- Moses, Barr (19 August 2021). "What is a Data Mesh — and How Not to Mesh it Up". Medium. Retrieved 28 January 2022.
- Andy Mott (2022-01-12). "Driving Faster Insights with a Data Mesh". RTInsights. Retrieved 2022-03-01.
- "How to Move Beyond a Monolithic Data Lake to a Distributed Data Mesh". martinfowler.com. Retrieved 28 January 2022.
- Baer (dbInsight), Tony. "Data Mesh: Should you try this at home?". ZDNet. Retrieved 2022-02-10.
- "Developments that will define data governance and operational security in 2022". Help Net Security. 2021-12-28. Retrieved 2022-03-01.
- Bane, Andy. "Council Post: Where Is Industrial Transformation Headed In 2022?". Forbes. Retrieved 2022-03-01.
- "Time To Stop Messing With Data Mesh | CDOTrends". CDOTrends | Digital & Data Insights for Business Leaders. Retrieved 2022-03-01.
- Cunningham, Justin. "Netflix Data Mesh: Composable Data Processing". Retrieved 2022-04-29.
- Baker, Tristan (2021-02-22). "Intuit's Data Mesh Strategy". Intuit Engineering. Retrieved 2022-04-29.
- "Data Mesh, 10 consigli pratici per adottarlo con successo in azienda". 01net (in Italian). 2022-03-11. Retrieved 2022-04-29.
- "Data Mesh: the newest paradigm shift for a distributed architecture in the data world and its application - Webthesis". webthesis.biblio.polito.it. Retrieved 2022-04-29.
- Andy Mott (2022-01-12). "Driving Faster Insights with a Data Mesh". RTInsights. Retrieved 2022-03-01.
- "Data Mesh defined | James Serra's Blog". 16 February 2021. Retrieved 28 January 2022.
- "Analytics in 2022 Means Mastery of Distributed Data Politics". The New Stack. 2021-12-29. Retrieved 2022-03-03.
- "Developments that will define data governance and operational security in 2022". Help Net Security. 2021-12-28. Retrieved 2022-03-01.