“There Is No AI Without Data”
Communications of the ACM, November 2021, Vol. 64 No. 11, Pages 98-108
By Christoph Gröger
“AI has not yet delivered on the promises in industry practice. The core business of industrial enterprises is not yet AI-enhanced.
We see a major need for future research regarding functional capabilities and realization technologies for an enterprise data marketplace.”
Artificial intelligence (AI) has evolved from hype to reality over the past few years. Algorithmic advances in machine learning and deep learning, significant increases in computing power and storage, and huge amounts of data generated by digital transformation efforts make AI a game-changer across all industries. AI has the potential to radically improve business processes with, for instance, real-time quality prediction in manufacturing, and to enable new business models, such as connected car services and self-optimizing machines. Traditional industries, such as manufacturing, machine building, and automotive, are facing a fundamental change: from the production of physical goods to the delivery of AI-enhanced processes and services as part of Industry 4.0. This article focuses on AI for industrial enterprises with a special emphasis on machine learning and data mining.
Despite the great potential of AI and the large investments in AI technologies undertaken by industrial enterprises, AI has not yet delivered on the promises in industry practice. The core business of industrial enterprises is not yet AI-enhanced. AI solutions instead constitute islands for isolated cases—such as the optimization of selected machines in the factory—with varying success. According to current industry surveys, data issues constitute the main reasons for the insufficient adoption of AI in industrial enterprises.
In general, it is nothing new that data preparation and data quality are key for AI and data analytics, as there is no AI without data. This has been an issue since the early days of business intelligence (BI) and data warehousing. However, the manifold data challenges of AI in industrial enterprises go far beyond detecting and repairing dirty data. This article investigates these challenges in depth and rests on our practical real-world experiences with the AI enablement of a large industrial enterprise, a globally active manufacturer. In addition, we undertook systematic knowledge sharing and experience exchange with other companies from the industrial sector in order to present issues that are common to industrial enterprises beyond an individual case.
As a starting point, we characterize the current state of AI in industrial enterprises, called “insular AI,” and present a practical example from manufacturing. AI is typically done in islands with use-case-specific data provisioning and data engineering, leading to a heterogeneous and polyglot enterprise data landscape. This causes various data challenges that limit the comprehensive application of AI.
We particularly investigate challenges to data management, data democratization, and data governance that result from real-world AI projects. We illustrate them with practical examples and systematically elaborate on related aspects, such as metadata management, data architecture, and data ownership. To address the data challenges, we introduce the data ecosystem for industrial enterprises as an overall framework. We detail both IT-technical and organizational elements of the data ecosystem, for example, data platforms and data roles. Next, we assess how the data ecosystem addresses individual data challenges and paves the way from insular AI to industrialized AI. Then, we highlight the open issues we face in the course of our real-world realization of the data ecosystem and point out future research directions, for instance, the design of an enterprise data marketplace.
About the Author:
Christoph Gröger is enterprise architect for data analytics at Bosch and a senior technical professional in Bosch’s global data strategy team in Stuttgart, Germany.