AI runs on data. But many companies still struggle to get their data in good shape. Hugh Burgin, EY Americas AI and analytics leader, said getting your data AI-ready is critical to competing effectively in the age of AI.
The AI Innovator caught up with Burgin to find out exactly what it takes to get one’s data AI-ready. What follows is an edited version of that conversation.
The AI Innovator: Why haven’t companies gotten their data act together yet?
Hugh Burgin: We released our AI Pulse Survey results in December 2024, and one of the findings was that 83% of executives said their adoption of AI would be much faster if their data infrastructure were more mature.
So a lack of capability around data infrastructure is hindering speed to market for a lot of enterprises. The other factor there was training: about 60% of executives said that a lack of employee training on AI is slowing down their adoption. I think the data infrastructure investment is important, but I also think that successful deployment of AI should not depend on perfect data. Enterprises rarely, if ever, have perfect data available.
There are a lot of use cases you can implement with imperfect data. In many ways, an AI model can learn from data whether it’s perfect or not, and you can get a lot of value out of that. The focus should be on what we call AI-ready data.
When you make data investments, you should be thinking about AI: how am I going to leverage AI with those investments, and how am I going to use enterprise data and external data to get value out of them? But many companies are still lacking in those efforts. Many also struggle to bring it all together in one place; simple visibility into and access to data remain key challenges.
We actually asked our AI engineers and data scientists, ‘What do you need from data to make it AI-ready? What would help you be more successful in your AI projects?’ We summarized the answers into seven characteristics. They were things like: Is it trusted? Is it visible? Is it accessible? Is it recent, from today rather than last month? On the surface these characteristics seem obvious, but they describe what data should be to be AI-ready, and many companies struggle with that at the holistic enterprise level.
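To make one of those characteristics concrete, here is a minimal Python sketch of a “recent” check, assuming each dataset carries a last-updated timestamp. The function name and the one-day window are illustrative assumptions, not EY’s tooling.

from datetime import datetime, timedelta, timezone

def is_recent(last_updated, max_age=timedelta(days=1)):
    # Data counts as "recent" if it was refreshed within the allowed window.
    # The one-day window is an assumption; pick whatever the use case needs.
    return datetime.now(timezone.utc) - last_updated <= max_age

now = datetime.now(timezone.utc)
print(is_recent(now - timedelta(hours=6)))   # True: refreshed today
print(is_recent(now - timedelta(days=30)))   # False: last month's snapshot

In practice a check like this would run against catalog metadata rather than hard-coded timestamps, but the principle is the same: freshness is something you can test for, not just hope for.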
Explain what being visible means when it comes to data.
What I’m referring to is the ability to search the metadata about the data. How do you know what data is out there? How do you know the business context and the technical context of that data, and how can you make it visible so that you can search for it, find it and use it when you need it, rather than having it hidden in a different part of the organization?
That’s especially crucial when you think about all the business functions and their data, whether that’s marketing, finance or supply chain. If you’re in marketing, you might get a lot of value out of supply chain data. If you’re in finance, you might get a lot of value out of understanding customer data. But that data isn’t always accessible and visible across the enterprise; you can only see what you use every day.
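As a rough illustration of visibility, a searchable metadata catalog can be as simple as the following Python sketch. The entries, owners and fields are hypothetical, not a specific product; the point is that business and technical context are indexed and findable.

from dataclasses import dataclass

@dataclass
class DatasetEntry:
    name: str
    owner: str          # which part of the organization holds the data
    description: str    # business context
    columns: list       # technical context

catalog = [
    DatasetEntry("supply_chain_shipments", "Supply Chain",
                 "Outbound shipment events by SKU and region",
                 ["sku", "region", "shipped_at"]),
    DatasetEntry("customer_feedback", "Marketing",
                 "Free-text survey responses from customers",
                 ["customer_id", "text", "submitted_at"]),
]

def search(term):
    # Match against names and descriptions so data in other functions
    # is discoverable, not just the data you use every day.
    term = term.lower()
    return [e for e in catalog
            if term in e.name.lower() or term in e.description.lower()]

# A marketing analyst can discover supply chain data they never touch directly.
for entry in search("shipment"):
    print(entry.owner, "->", entry.name)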
One of the difficulties of most data today is that it’s unstructured. How does that play into having AI-ready data?
We’re seeing a major focus from companies on what we call a lakehouse architecture. Think of a data lake, traditionally, as an open platform that can receive any type of data, whether it’s images or unstructured text or structured data, like finance data. A data warehouse, by contrast, is traditionally highly structured, with columns and rows and predefined attributes.
What we’re seeing a lot of companies invest in is something else: a lakehouse architecture, which brings together the best of both worlds. You have access to your structured data in a highly performant, responsive database, but you also have access to unstructured data that lets you get insights that were previously difficult to reach – whether that’s customer sentiment, customer feedback or images of a store shelf. Those things are a lot more accessible now but require a different approach to how you envision your data architecture.
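A toy illustration of that “best of both worlds” idea, using Python with pandas as a stand-in: real lakehouses rely on table formats and engines such as Delta Lake or Apache Iceberg, so the data and the matching rule here are purely illustrative assumptions.

import pandas as pd

# Structured, warehouse-style data: predefined columns and rows.
sales = pd.DataFrame({"store_id": [1, 2], "revenue": [12500.0, 9800.0]})

# Unstructured, lake-style data: free-text customer feedback per store.
feedback = pd.DataFrame({
    "store_id": [1, 2],
    "text": ["Shelves were well stocked", "Checkout line was too long"],
})

# One query over both: flag stores whose feedback mentions a problem
# alongside their structured revenue figures.
combined = sales.merge(feedback, on="store_id")
combined["negative"] = combined["text"].str.contains(
    "too long|out of stock", case=False)
print(combined)

The design point is the join itself: sentiment, feedback and sales sit in one architecture instead of two disconnected systems.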
You say that businesses should prioritize an internal data marketplace to accelerate AI value. What does that look like?
What we’re seeing companies invest in is a data marketplace, or a data product marketplace, which is a platform to allow users throughout the enterprise to go to a central location. Imagine a digital application where they can actually search for the types of data they need, whether that’s customer feedback data, ERP data, or supply chain data, and they can find it, they can read about it, they can understand it. They can actually subscribe to it. …
There are a few benefits. One is reliability. Rather than having to go to 25 different locations to find 25 different data sets, you can reliably go to one spot and find a source of truth that’s governed and accessible and provided by that business. The other value proposition is efficiency. Rather than having teams across the organization invest in the same data projects over and over, they can go to one place to find that trusted, reliable source of data.
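Here is a minimal Python sketch of the subscribe pattern such a marketplace enables; the product names, stewards and registry are hypothetical, meant only to show one governed product serving many consumers.

class DataProduct:
    def __init__(self, name, steward):
        self.name = name
        self.steward = steward        # the owning business, the source of truth
        self.subscribers = []

    def subscribe(self, team):
        # Teams reuse one governed product instead of rebuilding the pipeline.
        self.subscribers.append(team)
        return f"{team} subscribed to {self.name} (steward: {self.steward})"

marketplace = {p.name: p for p in [
    DataProduct("erp_orders", "Finance"),
    DataProduct("customer_feedback", "Marketing"),
]}

# One spot, one source of truth: two teams consume the same product.
print(marketplace["erp_orders"].subscribe("Supply Chain Analytics"))
print(marketplace["erp_orders"].subscribe("Revenue Forecasting"))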
How do you organize the data into one place?
One of the great lessons learned over the last few years is the importance of creating the role of data stewards who come from the source of the data. They have a depth of knowledge in that data source, and they can really be the ones who ensure that it’s the right data coming from the right place and who can answer questions about it, rather than having a central team manage everything on behalf of the entire company.
What are your top trends for 2025?
In 2025, I see the companies that are making meaningful investments in the AI space differentiating themselves from others. They’re implementing truly transformational use cases. And what we saw was that 80% of companies that have invested in AI successfully are planning to invest more than $10 million over the next year. So we’re seeing companies that are proving that AI works. They’ve completed the experimentation phase. They have successful use cases. Now they’re focused on: how do we truly transform our business or function? I think that’s going to really start to show up in the competitive landscape.
Hardware will continue to get more workload-optimized. Data centers will become more AI-powered. Enterprises are going to get more intelligent about what’s the best-fit model for a use case. … Large language models themselves are going to become more effective and efficient, and so we’ll continue to see that cost-effectiveness improve over the next 12 to 24 months.
The other thing I would focus on is the promise of agentic AI. I think agentic AI is an exciting future for the corporate enterprise. And in many ways, we’re already there, and we’re seeing it today with our clients. In other ways, we’re very much in the early stage. People describe agentic AI as autonomous in decision-making, low-code, easy to implement, and less narrowly focused than historical use cases.