Why clean data should be every bank’s New Year’s resolution 

As retail banks increase AI investment, many risk hitting an “AI wall” caused by fragmented legacy systems and inconsistent data. Mahesh Paolini-Subramanya, CTO at BKN301, argues that without clean, governed, real-time data foundations, AI in banking cannot scale safely or reliably in 2026.
Clean data in banking


We’re at the time of year when many of us make well-intentioned resolutions to reset and refresh, whether that’s to exercise more, use our phones less, or eat more healthily.

In 2026, retail banks should be making a resolution too: to get their data clean. Retail banking has embraced artificial intelligence (AI) with real energy: pilots have become launches, and teams have hired AI specialists at scale. The sector has gone through a period of rapid AI experimentation. Nationwide has launched Cora+, its first generative AI (GenAI) virtual assistant, to improve customer service, while Lloyds Banking Group has rolled out M365 Copilot to more than 93% of its employees.

And this shows no signs of slowing. Half of all UK financial institutions plan to increase their AI investment in the next year, after seeing dramatic productivity gains. 

But in financial services, we’re at risk of hitting an AI wall. AI is only as reliable as the data it consumes. Many banks are trying to run next-generation intelligence on last-generation systems. Poor data quality is disrupting operations, consuming time, increasing costs, and undermining confidence in automated outcomes – but if banks give their data a spring clean, meaningful AI adoption will follow.

 


GenAI is probabilistic by nature 

Crucially, GenAI systems generate likely outcomes rather than guaranteed ones, whereas retail banking runs on certainty. A customer doesn’t want the ‘most likely’ explanation for why their balance has changed; they want the exact reason. If the underlying data is inconsistent, unclear, or fragmented, then even the best model will amplify that uncertainty at speed.

The root of this problem lies in legacy data architecture, often the result of large, incumbent banks growing by acquiring smaller firms and inheriting their systems. Most financial institutions still operate in environments designed for transaction processing and periodic reporting.

This results in fragmented data spread across multiple systems, where different products, such as current accounts, loans, mortgages, and cards, sit on separate platforms with different data models. Over time, digital channels, analytics tools, and regulatory solutions have been layered directly onto core systems. Data is then extracted, replicated, and transformed repeatedly to meet immediate needs. While this has increased data availability, it has also fragmented definitions, weakened lineage, and eroded trust. 

Indeed, most banks have invested heavily in warehouses, lakes, and analytics platforms to try to fix their data dilemmas. But those environments are often built for reporting – backwards-looking views of what happened – rather than a reliable, real-time operational picture of what’s happening right now.

 


Preventing AI initiatives from stalling  

Ultimately, if you don’t trust your data, you cannot trust the outputs. AI initiatives struggle to progress beyond controlled environments because the cost of error is too high. This explains why many AI projects stall, despite strong investment and technical capability. 

The problem is not that AI has failed, but that it has exposed weaknesses in data foundations. This is where the New Year’s resolution mindset is useful: a deliberate pause to reset habits, take stock, and build discipline. For banks, that discipline is data. Instead of adding yet another layer of AI tooling, banks should commit to the fundamentals that make AI dependable.

 


What clean data looks like in practice 

Clean data is consistent, well-defined, and aligned with business logic. Organisations cannot extract value from advanced analytics or AI unless data is first organised, governed, and made usable across the enterprise.

In practice, this means banks need to standardise data from all backend sources into a clean, continuously updated foundation – consolidating fragmented information into a single, governed architecture.  
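To make the idea of standardising backend sources more concrete, here is a minimal Python sketch. It assumes two hypothetical platforms (a current-account system and a cards system) that describe the same customer with different field names, units, and timestamp formats. Every name here (CanonicalBalance, CUST_NO, BAL_PENCE, customerId, and so on) is invented for illustration and is not taken from any real banking system.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from decimal import Decimal


@dataclass(frozen=True)
class CanonicalBalance:
    """One agreed shape for a balance, whichever platform it came from."""
    customer_id: str
    product: str
    balance: Decimal      # always in major currency units
    currency: str         # always an upper-case code
    as_of: datetime       # always timezone-aware UTC
    source_system: str    # kept so the record can be traced back


def from_core_accounts(row: dict) -> CanonicalBalance:
    # Hypothetical source: balance in pence, timestamp in epoch seconds.
    return CanonicalBalance(
        customer_id=row["CUST_NO"].strip().upper(),
        product="current_account",
        balance=Decimal(row["BAL_PENCE"]) / 100,
        currency="GBP",
        as_of=datetime.fromtimestamp(row["TS"], tz=timezone.utc),
        source_system="core_accounts",
    )


def from_cards(row: dict) -> CanonicalBalance:
    # Hypothetical source: decimal string, ISO 8601 timestamp with offset.
    return CanonicalBalance(
        customer_id=row["customerId"].strip().upper(),
        product="card",
        balance=Decimal(row["outstanding"]),
        currency=row["ccy"].upper(),
        as_of=datetime.fromisoformat(row["updatedAt"]).astimezone(timezone.utc),
        source_system="cards",
    )
```

The point of the sketch is that downstream consumers, whether an AI model or a compliance report, only ever see CanonicalBalance. The quirks of each source are handled once, at the edge, rather than in every pipeline.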

This includes automated data quality checks that detect duplicates, missing fields, and other inconsistencies as they creep in, and flag these gaps before they reach downstream systems. Traceable lineage also ensures that teams can explain where the data came from, how it was transformed, and who owns it.
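As a rough illustration of what such checks and lineage tags might look like, the Python sketch below scans a batch of records for duplicates, missing fields, and one example of inconsistency (a malformed currency code), then attaches simple lineage metadata. The schema in REQUIRED_FIELDS, the function names, and the "_lineage" key are all assumptions made for this example. A production setup would more likely rely on a dedicated data quality or catalogue tool.

```python
from collections import Counter

# Assumed minimal schema for the example; a real bank's would be far richer.
REQUIRED_FIELDS = ("customer_id", "product", "balance", "currency")


def check_quality(records: list[dict]) -> list[dict]:
    """Return one issue per problem found; an empty list means the batch passed."""
    issues = []

    # Duplicates: the same customer and product appearing more than once.
    keys = Counter((r.get("customer_id"), r.get("product")) for r in records)
    for key, count in keys.items():
        if count > 1:
            issues.append({"check": "duplicate", "key": key, "count": count})

    for i, r in enumerate(records):
        # Missing fields: absent, None, or an empty string.
        missing = [f for f in REQUIRED_FIELDS if r.get(f) in (None, "")]
        if missing:
            issues.append({"check": "missing_fields", "row": i, "fields": missing})

        # Inconsistency: currency codes should be three upper-case letters.
        ccy = r.get("currency")
        if ccy and not (len(ccy) == 3 and ccy.isalpha() and ccy.isupper()):
            issues.append({"check": "inconsistent_currency", "row": i, "value": ccy})

    return issues


def with_lineage(record: dict, source: str, transform: str, owner: str) -> dict:
    """Attach where the data came from, how it was transformed, and who owns it."""
    return {**record, "_lineage": {"source": source, "transform": transform, "owner": owner}}
```

A batch that returns any issues can be held back or routed for review before it feeds a model or a report, which is the "before they reach downstream systems" part of the argument.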

The result of this spring cleaning is both better reporting and better banking. A clean data foundation gives banks reliable, consistent data to power AI, analytics, compliance, and digital operations. It also speeds up the delivery of new services, as teams work from standardised models instead of rebuilding pipelines for every product or use case. Taken together, these benefits lower operational risk by eliminating the conflicting data sources, manual reconciliation, and avoidable errors that undermine confidence.

 


Is it ‘New Year, new me’ for retail banks? 

While it may not be as glamorous as the next shiny new AI pilot, clean data needs to be the resolution that comes first.  

This time next year, we can hopefully look back and see 2026 as the year that banks made AI truly operational, with reliable, consistent data becoming the norm and the engine powering AI, analytics, compliance, and digital operations.

 

Finextra Article

 
