When Data Becomes a State Matter: The Genesis of a Now-Indispensable Governance
In the 1990s, the advent of data warehouses exposed the limitations of siloed transactional systems, initiating the first standardization efforts. Historical sources do not allow us to trace this discipline back beyond the 1990s, the period in which the first concepts and practices that define it emerged. The turn of the millennium—marked by major financial scandals and the adoption of the Sarbanes-Oxley Act—propelled the structuring of organizational practices centered on mastering data. This article explores the genesis of data governance, shaped by the dynamic interplay between technological advances and regulatory imperatives. The analysis highlights the continuity between these two crucial periods, revealing the technical and organizational concepts that laid the foundations for the data governance we know today.
DATA GOVERNANCE
Charles Ngando Black
8/20/2025 · 6 min read
Introduction
The idea that data could be a strategic asset for an organization did not emerge spontaneously. Long seen as a byproduct of operations, data was managed primarily as a technical and software concern. It was at the convergence of two distinct yet complementary movements—one stemming from the rise of Business Intelligence in the 1990s, the other fueled by regulatory imperatives in the early 2000s—that the concept of data governance truly took shape.
This convergence crystallized around pioneering efforts: in 2003, The Data Warehousing Institute (TDWI) formalized early “data governance” concepts, while practitioners such as Larry English and Gwen Thomas—who would found The Data Governance Institute in 2004—laid the theoretical foundations of an autonomous discipline. In parallel, IBM played a unifying role, developing its first “Information Governance” consulting offerings and rallying multiple organizations around this emerging issue. This article retraces that genesis, highlighting the respective contributions of business intelligence and compliance requirements—two driving forces that together gave birth to a discipline now considered a cornerstone of any truly data-driven organization.
The Preconditions: An Information Environment in Transition
Before the 1990s–2000s convergence, the technological and regulatory environment lacked the conditions necessary for data governance to emerge. The 1980s were characterized by information systems centered on isolated applications, each database serving a specific purpose without cross-functional integration. The very notion of an enterprise “information asset” did not exist, as data was viewed as a technical byproduct rather than a strategic resource. Likewise, the absence of a binding regulatory framework for the reliability of financial information created no institutional pressure to formalize data management practices. It was only with the convergence of technological innovations in business intelligence and the regulatory requirements of the early 2000s that the conceptual and organizational foundations of data governance could emerge.
1. Business Intelligence: Catalyst for Awareness of Information Chaos
The 1990s saw the massive adoption of transactional systems such as ERP (Enterprise Resource Planning) and CRM (Customer Relationship Management). While these systems optimized operational processes, they also fragmented the corporate information landscape. The proliferation of heterogeneous, often inconsistent, and poorly integrated data sources made cross-functional, coherent use of information complex, if not impossible—a problem that the isolated, barely interacting applications of the 1980s had never posed.
To address this growing fragmentation, the concept of the data warehouse, popularized by the pioneering work of Bill Inmon and Ralph Kimball, introduced a novel approach based on the integration and centralization of data for analysis and decision-making. These decision-support systems quickly revealed the need to harmonize business definitions across disparate source systems, to establish data glossaries—often implicitly at first—and to develop increasingly rich technical documentation around metadata.
Fundamental tools such as ETL (Extract, Transform, Load) processes emerged to clean, transform, and integrate data before loading it into the warehouse. In parallel, OLAP (Online Analytical Processing) tools allowed users to explore data multidimensionally, providing new perspectives for understanding business activity.
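The ETL pattern described above can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the source systems, field names, and harmonization rules below are invented for the example.

```python
import csv
import io

# -- Extract: two heterogeneous "systems" with inconsistent schemas --
# A hypothetical ERP export (CSV, zero-padded string IDs) ...
erp_csv = "cust_id,revenue\n001,1200.50\n002,830.00\n"
# ... and a hypothetical CRM extract (dicts, different key names).
crm_rows = [{"CustomerID": "1", "Region": "EMEA"},
            {"CustomerID": "2", "Region": "APAC"}]

def extract_erp(text):
    """Read the ERP export into a list of row dicts."""
    return list(csv.DictReader(io.StringIO(text)))

# -- Transform: harmonize keys and types into one business definition --
def transform(erp_rows, crm_rows):
    revenue = {int(r["cust_id"]): float(r["revenue"]) for r in erp_rows}
    region = {int(r["CustomerID"]): r["Region"] for r in crm_rows}
    return [{"customer_id": k, "revenue": revenue[k], "region": region.get(k)}
            for k in sorted(revenue)]

# -- Load: append into the (here, in-memory) warehouse table --
warehouse = []
warehouse.extend(transform(extract_erp(erp_csv), crm_rows))
```

The interesting work happens in the transform step, which is exactly where the glossary and metadata questions of the period surfaced: deciding that `cust_id` and `CustomerID` denote the same business entity is a governance decision, not a purely technical one.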
However, this first wave of decision-support initiatives remained largely dominated by IT teams. Data quality and governance were still barely visible at the business level, in the absence of significant end-user involvement, formally established processes, or dedicated roles such as data stewards.
Despite these limitations, this period marked a significant cultural turning point. Data gradually ceased to be perceived as mere transactional flows and began to be recognized as a transversal asset—requiring greater interdepartmental collaboration, rigorous documentation, and control mechanisms to ensure its reliability and relevance.
2. The Compliance Imperative: A Catalyst for Structuring Governance
The early 2000s brought a sudden awareness of the risks linked to poor data management, notably through a series of major financial scandals such as Enron and WorldCom. These incidents revealed failures that could not have occurred in the technical environment of earlier decades, where siloed systems inherently limited the scope of large-scale financial data manipulation. They highlighted the central role that erroneous or manipulated financial data could play in large-scale accounting fraud, made possible by the growing interconnection of information systems.
The legislative response to these crises was swift and significant, with the United States adopting the Sarbanes-Oxley Act (SOX) in 2002. This law imposed strict standards for the certification of financial reports by executives and for the implementation of rigorous internal controls over the processes that produce and disseminate financial data.
To comply with these new regulatory requirements—particularly the obligation for executives to personally certify the reliability of financial information (Section 302) and to ensure the auditability of data processing (Section 404)—organizations were forced to reassess and reorganize their information management practices. This transformation drew on the expertise of firms such as IBM Global Services and Accenture, which bridged the gap between BI’s technical concepts and the new organizational requirements of compliance.
Data traceability, secure information archiving, and exhaustive documentation of processes became non-negotiable obligations. This new reality required clearly designated business owners responsible for data quality and integrity, along with the deployment of technical solutions tailored to ensuring compliance.
In this demanding regulatory context, practices such as data lineage and Master Data Management (MDM) spread rapidly. MDM, in particular, enabled unified and coherent views of critical entities such as customers, products, or suppliers by consolidating information that had previously been scattered and potentially contradictory across different operational systems. These initiatives marked a decisive step in structuring data governance, which progressively ceased to be perceived as a mere IT project and began engaging the responsibility of the entire organization.
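The MDM consolidation described above can be sketched as a survivorship rule over duplicate records. The field names and the rule chosen here (most recently updated non-empty value wins) are illustrative assumptions; real MDM platforms support far richer matching and survivorship logic.

```python
# Two views of the same customer, scattered across operational systems.
records = [
    {"source": "billing", "email": "a.martin@example.com", "phone": "",
     "updated": "2003-05-01"},
    {"source": "crm", "email": "", "phone": "+33 1 23 45 67 89",
     "updated": "2004-02-15"},
]

def golden_record(records, fields=("email", "phone")):
    """Build a single consolidated view: for each attribute, keep the
    most recently updated non-empty value across source systems."""
    ordered = sorted(records, key=lambda r: r["updated"])  # oldest first
    golden = {}
    for rec in ordered:
        for field in fields:
            if rec[field]:          # later non-empty values overwrite
                golden[field] = rec[field]
    return golden

master = golden_record(records)
```

Even this toy version makes the organizational point: someone has to decide which source is authoritative for which attribute, which is why MDM pushed data ownership questions out of IT and into the business.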
3. Data Governance: At the Crossroads of Culture, Compliance, and Strategy
The history of data governance is that of a gradual convergence between the intrinsic need for technical coherence—initially driven by the goal of improving decision-making—and the imperative of transparency and compliance imposed by an increasingly demanding regulatory environment. While in the 1990s the main motivation was analytical—improving decision-making by eliminating data silos and promoting an integrated view of information—by the early 2000s it had become fundamentally legal and organizational: ensuring regulatory compliance, managing operational and financial risks, and protecting the company’s reputation.
It is within this convergence that the modern concept of “data governance” truly emerged. Gwen Thomas’s work at The Data Governance Institute, founded in 2004, formalized an approach that went beyond technical management to encompass organizational and strategic dimensions. In parallel, Larry English developed his concepts of stewardship and business accountability, which continue to shape the discipline today.
Over time, key technical concepts such as the “Single Version of the Truth” found their extension in the establishment of centrally managed, coherent critical data repositories. The rudimentary ETL scripts of the early days evolved into industrialized data quality processes incorporating rigorous validation and control procedures. Metadata catalogs, initially conceived as simple technical inventories, expanded into true tools for managing and understanding the information asset, while validation workflows were implemented to structure and secure access to sensitive data.
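The industrialized validation controls mentioned above can be pictured as declarative rule sets applied to every record before loading. The rules below (a positive identifier, a well-formed email, a country code checked against reference data) are illustrative assumptions, not a standard rule catalog.

```python
import re

# Declarative quality rules: field name -> predicate.
RULES = {
    "customer_id": lambda v: isinstance(v, int) and v > 0,
    "email": lambda v: bool(re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", v)),
    "country": lambda v: v in {"FR", "US", "DE"},  # reference-data check
}

def validate(record):
    """Return the names of the rules this record fails."""
    return [field for field, rule in RULES.items()
            if field in record and not rule(record[field])]

ok = {"customer_id": 42, "email": "jane@example.com", "country": "FR"}
bad = {"customer_id": -1, "email": "not-an-email", "country": "XX"}
```

Separating the rules from the checking code is the small-scale analogue of the shift the article describes: the rules become a business-owned artifact that can be documented, versioned, and audited independently of the pipeline that enforces them.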
Thus, data governance gradually moved beyond the exclusive domain of technical support to become a cross-functional management function, closely linked to the company’s overall strategy, performance objectives, and compliance obligations. It has become essential for ensuring the reliability of information used in decision-making, protecting against regulatory risks, and fully exploiting the strategic potential of data.
Conclusion
Far from being a discipline that appeared out of nowhere, data governance was built through the gradual layering of practices and concepts. Its roots lie in the first attempts at standardizing and centralizing data initiated by 1990s business intelligence, but its institutionalization and recognition as a critical function stem from the heightened regulatory pressure of the early 2000s. This dual origin gives it an inherently hybrid status: at once an organizational and technical infrastructure, an unavoidable compliance requirement, and a strategic lever for value creation.
Understanding this complex genesis is essential to better grasp the tensions and challenges that still run through data governance today—balancing the need for control and security, the imperative of agility and flexibility, and the ambition to turn data into a true driver of innovation and growth.
References
English, L. P. (1999). Improving data warehouse and business information quality. John Wiley & Sons.
IBM Global Services. (2004). Information governance framework: A strategic approach to data management. IBM Corporation.
Inmon, W. H. (1992). Building the data warehouse. John Wiley & Sons.
Khatri, V., & Brown, C. V. (2010). Designing data governance. Communications of the ACM, 53(1), 148-152.
Kimball, R., Reeves, L., Ross, M., & Thornthwaite, W. (1998). The data warehouse lifecycle toolkit. John Wiley & Sons.
Loshin, D. (2008). Master data management. Morgan Kaufmann.
Redman, T. C. (2001). Data quality: The field guide. Digital Press.
Sarbanes-Oxley Act of 2002, Pub. L. No. 107-204, 116 Stat. 745.
Seiner, R. S. (2004). Data stewardship: An actionable guide to effective data management and data governance. Academic Press.
The Data Warehousing Institute. (2003). Data governance and stewardship: Emerging practices. TDWI Research.
Thomas, G. (2004). The data governance imperative. The Data Governance Institute.
Watson, H. J., & Wixom, B. H. (2007). The current state of business intelligence. Computer, 40(9), 96-99.