FICO is rolling out what it calls a domain-specific alternative to general-purpose large language models, aiming to address one of financial services’ biggest concerns with generative AI: accuracy and accountability.
The analytics software and credit scoring giant has introduced the FICO Foundation Model for Financial Services, which includes a focused language model and a focused sequence model built on curated financial data rather than broad internet-scale training sets.
FICO says the models are designed to reduce hallucinations, improve auditability and operate with up to 1,000 times fewer computing resources than conventional large language models.
In this email Q&A with The AI Innovator, FICO Chief Analytics Officer Scott Zoldi discusses how domain-specific models could lift fraud detection and compliance performance, why responsible AI standards are becoming central to ROI, and how the company sees smaller, focused models reshaping AI adoption in financial institutions.
The AI Innovator: What are focused sequence models and why will these systems make agentic AI more reliable?
Scott Zoldi: A focused sequence model (FSM) is a very specialized model that leverages a customer’s entire transaction history to provide the greatest insight into whether the current transaction is fraudulent, risky, compliant with collections practices, or any of a multitude of other events.
FSMs leverage the entire context of the customer’s experience and transactions, and the benefits of a specially designed transformer model, to return extremely accurate results. This is in stark contrast to traditional transaction analytics systems that summarize a customer’s transactional behavior in a profile, and use that profile and the current transaction to make a prediction.
With FSMs, lifts – or improvements in detection – are significantly improved and false positives substantially reduced, presenting a huge advancement in how the financial services industry will address transaction analytics problems.
In addition to using the full customer context, the specialized transformer architecture that produces the probability score is composed of both contrastive and supervised heads. By that I mean the contrastive head essentially decides whether this transaction is out of pattern, or unusual, for that consumer.
That’s a signal we can incorporate into the supervised head to make a supervised prediction as to whether the transaction at hand represents fraud, a scam, credit risk, attrition, churn, hardship, or many other events. In this unique way, FSMs also benefit from a merging of unsupervised and supervised methodologies.
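The interplay of the two heads can be illustrated with a minimal sketch. This is not FICO’s architecture: a real FSM uses a trained transformer encoder over the full transaction sequence, whereas here the "contrastive head" is stubbed as distance from the customer’s own behavioral centroid and the "supervised head" as a logistic layer that consumes that out-of-pattern signal alongside the transaction features. All function and variable names are hypothetical.

```python
import numpy as np

def contrastive_score(history, current):
    """Stand-in for the contrastive head: how far the current transaction's
    embedding sits from the centroid of this customer's own history.
    Near 0 = typical for this customer; larger = out of pattern."""
    centroid = history.mean(axis=0)
    cos = current @ centroid / (
        np.linalg.norm(current) * np.linalg.norm(centroid) + 1e-9)
    return 1.0 - cos

def supervised_score(current, out_of_pattern, weights, bias):
    """Stand-in for the supervised head: a logistic layer over the
    transaction embedding plus the unsupervised out-of-pattern signal,
    trained on labeled outcomes (fraud, scam, credit risk, ...)."""
    z = current @ weights[:-1] + out_of_pattern * weights[-1] + bias
    return 1.0 / (1.0 + np.exp(-z))
```

In this toy setup, a transaction far from the customer’s usual behavior produces a high contrastive signal, which in turn pushes the supervised probability up — the merging of unsupervised and supervised methodologies described above.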
To answer the second part of this question, FSMs will make agentic AI more reliable because agents will be trained on highly focused, narrow tasks.
In using an FSM, we are not asking a generic agent to assert whether customer X might be defrauded. We are developing a specific model with only one objective: to determine whether customer X is experiencing third-party fraud, based on their entire transaction history and a large corpus of transaction data with outcomes on which to train these specialized agents.
FSMs will be task-trained for each specific task that needs to be executed; for example, a second agent may tell us whether customer X is perpetrating first-party fraud, and a third agent, whether customer X is being targeted for a scam event.
Agentic FSMs are highly specialized, leveraging multiple consortia of data and billions of transactions to task-train these models. Because they are finely specialized, FSMs automatically move agentic systems from providing generic advice to making laser-focused predictions.
In this way, we can see groups of FSM agents working together to solve problems in a coordinated, accurate, and auditable way, generating low false positives and high levels of customer satisfaction. Importantly, these agents meet the requirements of responsible AI; auditability of their development and ongoing monitoring of the agents provides sustained confidence and interpretability in the decisions they make.
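The pattern of many narrow, task-trained agents scoring the same customer independently can be sketched as follows. This is an illustrative sketch only, assuming each specialized agent is reduced to a callable that scores one task over the transaction history; the agent names, threshold, and `run_focused_agents` helper are all hypothetical, not FICO’s API.

```python
from typing import Callable, Dict, List

# Each "agent" is a focused model scoring exactly one narrow task
# (third-party fraud, first-party fraud, scam, ...) over the customer's
# full transaction history. Stubbed here as simple callables.
Agent = Callable[[List[dict]], float]

def run_focused_agents(history: List[dict], agents: Dict[str, Agent],
                       threshold: float = 0.5) -> Dict[str, object]:
    """Score every specialized agent independently, keeping every verdict
    for the audit trail, rather than asking one generic model for advice."""
    scores = {task: agent(history) for task, agent in agents.items()}
    flags = [task for task, score in scores.items() if score >= threshold]
    return {"scores": scores, "flags": flags}
```

Because each agent answers only its own question, the combined output is a set of auditable, per-task probabilities rather than one opaque generic judgment.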
FICO believes small language models could be the key to supercharge AI deployment at financial services firms. Explain why.
One of the tenets of developing and using analytic models is that we must have control of the data on which the models are built. This is a fundamental truth in data science and AI.
When people consider using large language models (LLMs) generically, they experiment and do proofs of concept (POCs), but those POCs often fail to lead to any sustainable value. One reason is that organizations do not feel comfortable applying LLMs, and they should not, because they have no control over how the base model was developed or how it may change over time.
Other reasons include not knowing how the models control or measure bias, whether the model is representative of the population to which it is applied, how the model should be monitored, and when the output should be distrusted or ignored. Even if the LLM provider offers access to the weights, that doesn’t tell you anything about the data on which these models are built, the bias within, or the possible risk you are taking by using them.
It’s critical to have control of the data on which the model is trained, because otherwise you will not know if the data is representative of the population you want to score. You can’t understand possible representation bias and other risks, such as sufficient data coverage. And one should be careful not to be misled by using RAG (retrieval augmented generation) or fine-tuning, as you can’t know how much of the model’s prior learning will end up impacting the decision it makes in your application.
Techniques such as RAG or fine-tuning have been shown to perturb the weights, and can make the models more unstable and less reliable. These enhancement techniques add a ‘self-inflicted’ form of bias and don’t address the underlying core biases and risks of using AI in applications where you can’t audit the data.
Small language models (SLMs) will supercharge AI development because data scientists can build these models from scratch, and define the data that will be used to train both domain and task versions of the language models. We refer to these models as focused language models (FLMs), which are essential for responsible AI applications.
There is a misconception that building models requires a huge server farm of GPUs, or specialized, dedicated servers reserved only for the largest companies that can afford them. SLMs level the playing field because organizations can build their own FLMs, with auditable transparency, at reasonable cost.
This will empower a boom in organizations shifting from trying to prompt, reason with, and mitigate bias in generic LLMs, to solving their pressing gen AI problems with proper data science principles and responsible AI.
These models can be built from scratch, provide transparency, and achieve sustained value as part of a mature data science practice. Prepare for organizations to roll out their own in earnest, and in doing so meet responsible AI practices to cross the chasm of gen AI adoption. Moreover, their results will outperform the largest models, as they use specialized FLMs to solve specific tasks that will drive business value.
How can blockchain de-risk agentic AI systems?
In the near future, we will solve complex problems with a multitude of agents. These agents need to be coordinated, and a full audit trail is needed of how the agents interact and respond to environmental stimuli and the data passed to them. Agents are allowed to adapt to their local environments, which means an agent’s behavior is driven in two ways: by the current environment and, importantly, by the history of what the agent has observed before taking on a specific task.
This all gets very complex when many agents are involved in a workflow. There is no way to reconcile the final decision without an audit trail that stores the data required to provide transparency as to how the task was solved. This entails understanding the state of these agents in the moment they made the decision, particularly if they are adapting over time, and are functions of data unavailable in the immediate task being completed.
To solve this problem, in comes blockchain. I’m a big believer in this technology, and have been granted two patents on the use of AI blockchains to ensure the responsible development of AI models, and the responsible monitoring of AI systems. These concepts will analogously apply to agentic workflows. A blockchain will be required to explain how these agents were developed, the sort of data they process, tools they access, and what needs to be monitored and behaviors enforced before the agent can complete its task.
For example, if I prescribe a decisioning task that is broken down across six different agents, we need to codify which agents were involved and what their internal ‘state,’ or operating model, was at the time of the task completion, as well as the data passed and environment at the time they made their decisions. Each agent informs the final decision.
Blockchain is ideal for tracking all this data, providing the audit trail needed for high-stakes decisions, where transparency and interpretability requirements still apply regardless of whether the decision is made by an agentic system. In fact, agentic systems will need more controls and audit than traditional AI, where controls are more easily enforced.
Blockchain can be used to enforce safety controls and monitoring practices to ensure the agents can be constrained – limited in autonomy – to operate within tightly monitored parameters. This makes blockchain a critical component in being able to trust agentic output, which will be time- and state-dependent, and often very complex to interpret, given the multiple agents’ coordination in completing complex tasks.
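The core audit-trail idea can be sketched with a minimal hash-chained log. This is an illustrative sketch, not FICO’s patented design or a production blockchain: each entry binds an agent’s identity, its operating state at decision time, the data it saw, and its decision to the hash of the previous entry, so any later tampering with a recorded decision breaks the chain. The `AgentAuditChain` class and its fields are hypothetical.

```python
import hashlib
import json

def _digest(record: dict, prev_hash: str) -> str:
    """Deterministic SHA-256 over a record plus the previous entry's hash."""
    payload = json.dumps(record, sort_keys=True) + prev_hash
    return hashlib.sha256(payload.encode()).hexdigest()

class AgentAuditChain:
    """Hash-chained audit log for multi-agent decisions (toy example)."""

    def __init__(self):
        self.entries = []

    def record(self, agent_id: str, state_version: str,
               inputs: dict, decision: str) -> None:
        """Append one agent's decision, chained to the prior entry."""
        prev = self.entries[-1]["hash"] if self.entries else "genesis"
        body = {"agent": agent_id, "state": state_version,
                "inputs": inputs, "decision": decision}
        entry = dict(body, prev=prev, hash=_digest(body, prev))
        self.entries.append(entry)

    def verify(self) -> bool:
        """Recompute every hash; any tampered entry invalidates the chain."""
        prev = "genesis"
        for e in self.entries:
            body = {k: e[k] for k in ("agent", "state", "inputs", "decision")}
            if e["prev"] != prev or e["hash"] != _digest(body, prev):
                return False
            prev = e["hash"]
        return True
```

The design choice that matters here is that each entry’s hash covers both its own content and its predecessor’s hash, which is what lets an auditor reconcile a six-agent decision after the fact: every agent’s state and inputs at the moment of decision are fixed in the chain.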
What’s ahead for FICO in AI in 2026?
FICO is tremendously inventive when it comes to AI, and AI innovation is incredibly important to me, personally. I am always excited to be in the data science lab, adding to my patent count of 116 AI and software patents granted, and 46 pending.
In 2026, FICO’s data science team will work to refine and operationalize AI blockchains that will empower a safe, responsible, and reliable agentic AI framework. I find it incredibly satisfying to develop the fabric of trust that will empower financial services organizations with safe, responsible, and reliable agentic AI value.
Another focus is on developing better ways to explain gen AI. Many organizations have looked at mechanistic interpretability, which is adjacent to the concept of interpretable AI models that I pioneered more than a decade ago for financial services organizations and FICO’s internal neural networks. I believe there is much work to be done to advance the interpretability of FLMs and FSMs. This includes addressing mechanistic interpretability for FLM and FSM use cases, and taking advantage of the very specific tasks they perform.
A third area of keen interest is good old-fashioned curiosity: AI curiosity. My team and I are extremely motivated to make our AI more like our human brains. A big part of that innovation is curiosity, and thinking hard about how we retain and forget information. We will see more algorithms reflecting those concepts in the future.
AI today is still tremendously naïve; we’re solving problems often through brute force and scaling laws, but we’re not changing the science and the physics of how these models should learn. As we work to address this challenge in 2026 and beyond, it will be amazing to see how these technologies will advance, while keeping AI robust, interpretable, ethical and auditable. That’s what responsible AI is all about at FICO.