PhoenixTeam

Everyone is hating on the the MIT study published in July, which claimed that 95% of organizations are getting a zero percent return on their genAI investments. This report, published by the MIT Media Lab, has been extensively debated by both critics and advocates, including some of the most recognized and respected voices on the AI circuit.

The 2.27% current impact represents the portion of total possible value that organizations are realizing today from agentic AI. (iceberg.mit.edu)

‍

"Despite $30-$40 billion in enterprise investment into GenAI, this report uncovers a surprising result in that 95% of organizations are getting zero return.... Just 5% of integrated AI pilots are extracting millions in value, while the vast majority remain stuck with no measurable P&L Impact."

That's a pretty spectacular claim. I certainly agree that finding the return on investment (ROI) is harder than expected, and I have seen teams swirl looking for that spectacular 2-4x ROI on one or just a handful of use cases. I think this study also ignites fear in all of us product companies looking to really make a difference in mortgage. We don't really want to talk about how hard it is to find meaningful and lasting change. So let's just put that out in the open.

What does the article actually say?

The main point is to argue that the key differentiator between success and failure are systems that learn. It argues that the classic ChatGPT model of assistive or conversational AI is great for short thinking tasks, and falls apart for long thinking due to lack of memory. It argues that agents are necessary to achieve real organizational value, and that there is a window of about 18 months to settle on partnerships that will help organizations really capitalize on the AI advantage.

Article content — I don't think these conclusions are wrong. In fact, I agree. However, I think they are, at best, weakly supported by a sparse set of anecdotal data in a study that has an agenda.

So basically it's a study to put data behind the claim that agents are the key to real value unlock, and that the time is now to seize the advantage. That's the bottom line, and I think it's useful. Yes, there are many reasons to hate on the study, but the bottom line strikes me as mostly valid.

Better than a bunch of hallway conversations?

The study was based on 52 interviews across "enterprise stakeholders", a "systematic analysis" of about 300 public AI initiatives, and surveys with 152 leaders. Not a super big or scientific study from my perspective. But still, let's put away the pitchforks. It's better than nothing, right? I think some of the best insights are revealed in the quotes.

"The hype on linked in says everything has changed, but in our operations, nothing fundamental has shifted." Little bit of victim mentality here but ok, yes there is a lot of hype and the PowerPoints do not agree with what is actually happening.
"If I buy a tool to help my team work faster, how do I quantify the impact? How do I justify it to my CEO when it won't directly move revenue or decrease measurable costs?". Preach - this is like THE problem. We only count two types of beans in mortgage - headcount and revenue. One has to go down and the other has to go up. otherwise we have no ROI.
"[ChatGPT is] excellent for brainstorming and first drafts, but it doesn't retain knowledge of client preferences or learn from previous edits. It repeats the same mistakes and requires extensive context input for each session. For high stakes work, I need a system that accumulates knowledge and improves over time." Yes and no on this one. The more I use ChatGPT, the better it performs relative to what I want it to do. It does anticipate what I will ask, and I have to provide less context. But yes, on an individual question basis, memory is an issue.
"I can't risk client data mixing with someone else's model, even if the vendor says it's fine". This is completely true and I hear it all the time.

A high bar defining success.

The report had a pretty high bar for the definition of success. Said in my words, success is defined as meaningful impact on the P&L, measured six months post deployment. Keep in mind, this wasn't actually measured, this was based on what those interviewed or surveyed said.

I've been a large scale commercial software product manager for a lot of my career. I've had many glorious successes and just as many spectacular failures. By this definition, I'm sure at least some of my successes would be failures. And if you consider what it takes to move in federal, I think success would be even more scarce. This definition applies to a narrow spectrum of small, turnkey, commercial solutions where you can turn it on and see immediate P&L impact.

While this is definitely the goal for all of us, I'm just not sure it's a realistic definition for the rest of the world. Or maybe I'm the one with the outdated perspective (ok, ok, probably it's a me problem and I am being defensive). I do base a lot of my experience on what the process has been like in the past. I certainly agree that in a world where we can go concept to cash in a week, we should be able to move the needle on the P&L in a matter of months.

Learning systems and the agentic web.

The authors are from the Networked AI Agents in Decentralized Architecture (NANDA) team at MIT. NANDA is a research initiative focused on how agentic, networked AI systems will impact organizational performance. They conduct research and host events that explore the future of what they call the agentic web, defined as "billions of specialized AI agents collaborating across a decentralized architecture".

Agentic AI according to NANDA researchers is the class of systems that embeds persistent memory and iterative learning by design, directly addressing what they see as the learning gap in assistive AI solutions like ChatGPT and wrapper based AI solutions.

That is also a high bar, in terms of the definition of an AI agent. In my classes and workshops, I typically define an AI agent as having four key characteristics, the ability to:

Perceive, understand, and remember context.
Reason about a problem.
Plan and take action.
Use tools.

I adapted this definition from Jensen at NVIDIA GTC earlier this year, so admittedly maybe it's time for me to evolve my definition, it has been about four months or so. I like my definition because its easy to communicate and remember, and it is easy to contrast from assistive or conversational AI. But just because it's easy, doesn't make it right. NANDA has a much more complicated perspective, resting on a foundation of what they call decentralized AI.

This idea of an agentic web resting upon a network of decentralized AI systems is complicated, and requires a level of technical sophistication that I really don't have. But I get the concept, and it makes theoretical sense. It just seems... really hard. It requires a lot of humans (?) to do a lot of sophisticated things around the world. Meanwhile in mortgage we are still just trying to figure out agents beyond the call center, research functions, and development acceleration (where agents are well established).

The five myths about genAI in the enterprise.

This I did find useful. It was a little section that did a good job painting the picture of common myths in genAI, some of which I agreed with, and I did stop and think about all of them.

Myth #1: AI will replace most jobs in the next few years. Yeah, no. Certainly across all major technological disruptions in the history of disruption, jobs become obsolete and new jobs were created. We are seeing fewer jobs for entry level team members. The Stanford Digital Economy Lab, using ADP employment data, found that entry-level hiring in “AI exposed jobs” has dropped 13% since large language models started proliferating.
Myth #2: Generative AI is transforming business. The study suggests that adoption is high but transformation is rare. I can echo this sentiment, this is what I see as well. I see very little truly transformational adoption in our industry.
Myth #3: Enterprises are slow in adopting new tech. The study indicates this is a myth, but then goes on to say "enterprises are extremely eager to adopt AI and 90% have seriously explored buying an AI solution". Exploring and adopting are just not the same thing, so I disagreed here.
Myth #4: The biggest thing holding back AI is model quality, legal, data, and risk. They argue as I've pointed out already that it's the lack of system learning that is the biggest barrier. This may be true, but model quality is a real problem, and it's what I hear about the most. the question is get the most often is "how do you know it's right" (followed closely by "is it safe"), so I don't 100% agree with the authors sentiment on this one.
Myth #5: The best enterprises are building their own tools. They state that "internal builds fail twice as often". This one is hard for me to substantiate as I tend to work with organizations that buy and build, perhaps with a slight lean towards the buy side. Naturally, this means my perspective will be skewed. I did find this fascinating, though, and in theory it makes sense. I'll have to dig into this one more and see.

Bottom line for us mere mortals in mortgage.

So cutting through all the jargon and the NANDA rabbit holes I explored through my study of the study, here's what I take away from all this for us in mortgage AI.

We need to redefine, or at least expand, our definition of an agent and think more thoughtfully about agentic AI. I continue to believe we have to start where we are, and agentic AI adoption is still very, very early for us in mortgage. I continue to believe that starting with good CI/CD pipelines for retrieval augmented generation (RAG) is the right foundation to start with for organizations that want to build. I don't generally advise organizations to skip steps, but maybe I should.
We need to be even more aggressive about agentic AI adoption and seeking out high-value, low complexity agentic use cases. I am already doing this, but not with enough vigor so I will place more emphasis on this.
We are still really early. The authors argue there is an 18-month window of opportunity. This is likely true in other industries but based on what I am seeing, our window is a bit longer, say 24 to 48 months. Longer in federal, of course. But it's coming.
And finally, perhaps most importantly, we have to continue to double-down on adaptive systems, and continue to incorporate what we learn and see from the actual operations into everything we build. This applies to our product strategy, our educational strategies, and workforce transformation considerations as well. This will take more thought and introspection, I'll let you know what I land on.

By Tela Mathias, Chief Nerd and Mad Scientist, PhoenixTeam

Insights

What does the infamous "MIT study" really mean to us in mortgage?

What does the article actually say?

Better than a bunch of hallway conversations?

A high bar defining success.

Learning systems and the agentic web.

The five myths about genAI in the enterprise.

Bottom line for us mere mortals in mortgage.

Accelerate Your Operations with AI-powered Expertise

Recent Insights

What does the infamous "MIT study" really mean to us in mortgage?

October 28, 2025

Blue Phoenix Awarded $215 Million VA Loan Guaranty DevSecOps Contract | A PhoenixTeam and Blue Bay Mentor-Protégé Joint Venture

October 3, 2025

Ten Not Very Easy Steps to Achieve AI Workforce Transformation

September 29, 2025

Stay Updated.

SOC 2 Certified

© 2025 PhoenixTeam. All rights reserved. | Privacy Policy | Terms of Use