Agentic Commerce News for the Weeks of 10/12-10/25 (weeks 42+43/52)

We're bacccckkk.... with our weekly Agentic Commerce news+data items from the last 2 weeks: We have academic research, survey data, new protocols, industry/strategy consultant reports and company news

Scot Wingo

Oct 25, 2025

The last couple of weeks have seen an incredible amount of activity in Agentic Commerce:

10/15 - Walmart and Salesforce Agentforce Commerce announced ChatGPT ACP/Instant Checkout partnerships
10/20 - We learned that Walmart’s integration does include 3P inventory (500m SKUs! 🤯)
10/21 - ChatGPT dropped the Atlas browser

As far as Retailgentic original content, make sure you didn’t miss our highly viewed and commented on ‘two part IIIs’→

ChatGPT Instant Checkout Deep Dive Part III - Here we covered our open questions, some FAQs with answers and addressed the rising negativity about Agentic Commerce with the growing number of Naysayers out there.
In Part IIIA of our GEO for Retailers series we introduced Agentic Commerce Optimization and the 9 step sequential process for retailer and brand success at ACO.
Finally, as a bit of a palate cleanser, we had a fun and ‘dreamy’ field report from our volunteer reporter who went to visit Daydream’s popup in SoHo.

And on the Pod we had two great episodes:

Turn the Tables - Friend of Retailgentic, Kiri Masters, took the interviewer seat and asked me some questions that were on her mind about Agentic Commerce, Canonicalization, and more
Manil Uppal - CEO of CartAI.ai - a company that provides the infrastructure to complete checkouts for agent builders. Want to learn how it works? Check out the pod!

While we were focused on these items here at Retailgentic, there was also a TON of interesting research reports, strategic pieces and new data points around Agentic Commerce.

Let’s get nerdy for a second and look at two interesting research papers:

Researchers Train an LLM to act like Simulated Consumers and they are 90% Accurate!

I read a lot of these frontier papers and my reactions are along this distribution curve:

85% - Wut? - I don’t understand it 🤔
10% - Meh. - I understand it, but not applicable to our world of Digital Commerce
4% - Cool - Hmm, that’s pretty cool, I’ll keep an eye on it
1% - Whoa! This could be a big game changer for our world.

For this “LLMs Reproduce Human Purchase Intent Via Semantic Similarity Elicitation of Likert Ratings” paper, it took me a while, but this one ended up squarely in the Whoa! category.

Some thoughts:

Notice that this was done between a research lab called PyMC Labs and the very large brand Colgate-Palmolive. That says to me this wasn’t an ivory tower academic thought experiment: a real company in our world was involved in (or sponsored) this research, and they would only do that if there was a ‘need’.
Think of how much $ big companies spend on consumer research. I’ve never had that luxury, so I do A/B testing or MVT. The problem with A/B and MVT is that it takes a long time to get an answer, and even if you have a great testing harness/tooling, it’s a pain, so you only end up testing super important things.
This paper shows you can train LLMs on
What if you took your demographic audience mix, created a template for ‘my customers’ and then you could load that template into any number of agents and run tests th

Here’s an example from the paper→

What they did is show this to a bunch of humans and synthetic humans, and the synthetic human feedback was a 90% match to the real humans. BUT, because they are software, you can ask the synthetic humans a variety of follow-up questions, like, “oh you said you found the third bullet confusing, can you tell me more about that?” In a static survey-type situation, by contrast, you frequently look at the data, smack your head and say: I should have asked this, that or the other. Also, with humans you have this tension between the completion rate of a survey and the complexity of the survey. Synthetic humans are indifferent between 5-question surveys and 500-question ones - it’s just more inference tokens! 💰
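For the nerds: the paper’s title points at the core trick, which is to let the LLM answer in free text and then map that text onto a Likert scale by semantic similarity, rather than asking the model for a number directly. Here is a minimal sketch of that idea, not the paper’s implementation. The anchor statements, the function names and the toy bag-of-words `embed` function are all my assumptions; a real setup would use a proper embedding model.

```python
import math
import re
from collections import Counter

# Hypothetical anchor statements, one per Likert point (1 = low intent, 5 = high).
ANCHORS = {
    1: "I would definitely not buy this product",
    2: "I would probably not buy this product",
    3: "I am not sure whether I would buy this product",
    4: "I would probably buy this product",
    5: "I would definitely buy this product",
}


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def likert_distribution(response):
    """Turn a free-text answer into a probability distribution over 1..5."""
    sims = {k: cosine(embed(response), embed(s)) for k, s in ANCHORS.items()}
    floor = min(sims.values())
    # Subtract the weakest similarity so the differences stand out, then normalize.
    shifted = {k: v - floor for k, v in sims.items()}
    total = sum(shifted.values())
    if total == 0:
        return {k: 1 / len(ANCHORS) for k in ANCHORS}  # no signal: uniform
    return {k: v / total for k, v in shifted.items()}


def expected_rating(dist):
    """Collapse the distribution into a single mean Likert score."""
    return sum(k * p for k, p in dist.items())


if __name__ == "__main__":
    answer = "I would definitely buy this product for my family"
    dist = likert_distribution(answer)
    print(dist, round(expected_rating(dist), 2))
```

The toy embedding can’t tell “definitely buy” from “definitely not buy” very well, which is exactly why you’d swap in a real embedding model. The shape of the pipeline (free text in, rating distribution out) is the part worth noticing.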

Here’s the paper:

Synthetic Consumer Paper


Google’s TUMIX Paper Creates Multi-Agent System that is 17% Better at Reasoning and Uses Half the Tokens.

At ReFiBuy we spend a lot of time with agents and how to architect them. In this Google paper, the researchers found that instead of having ‘jack-of-all-trades’ agents that can do a little of everything OK, you should have ‘Tool-Use Mixture’ (aka TUMIX) agents that are highly specialized.

In the paper they have 15 general purpose agents compete against 15 specialized agents and the specialized agents crush the GP’s.

This jibes with what we’re experiencing: agents are ‘imprints’ of humans, and the way you would structure a human org tends to work best with agents, but with all the benefits of them being digital (never sleep, infinitely scalable, etc.).

Google Tumix Paper


Mark Mahaney Preaches Agentic Commerce on CNBC

My long-time Wall St. friend Mark Mahaney was on CNBC after the big Walmart news talking about the news and Agentic Commerce. I can’t tell you how weird it is to have started thinking about where this could all go in Q4 of 2024 and to be here just a short year later, with it as the breaking news story on CNBC and Mahaney sharing his insights!

Another Agentic Commerce Survey Points to ~75% of LLM Answer Engine Users Engaging in Shopping

The merchant security company Riskified is out with a comprehensive n=5,400 survey that unsurprisingly talks a lot about the risks of Agentic Commerce, but they also had some interesting stats (the survey was completed Sept-Oct 2025, so it is ‘fresh’):

Nearly three in four shoppers (73%) are already using AI in their shopping journey.
Shoppers are embracing AI assistants like ChatGPT for product ideas (45%), to summarize reviews (37%), and to compare prices (32%).
While only 13% say they’ve completed a purchase after being referred by an AI assistant, 70% are at least somewhat comfortable with an AI agent making purchases on their behalf.
More than half (58%) are likely to use these tools for gift shopping this year.

Visa and Mastercard Announce Agentic Commerce Payment Protocols

Both Visa and Mastercard were out with new initiatives around their macro Agentic Payments strategies.

Visa:

Visa announced the Trusted Agent Protocol (TAP) - details are here. TAP enhances their Visa Intelligent Commerce suite and adds transparency, safety/trust and merchant visibility. In layman’s terms, it adds a layer of security built on security and intent data that expires and cannot be forwarded or relayed, which increases the security and transparency of Visa’s agentic payment system. I found this flow/sequence diagram helpful→

Mastercard:

Mastercard announced their own improvements to their XX called the Mastercard Agent Pay Acceptance Framework. It includes a Web Bot Auth standard that CDNs are adopting, which will let Mastercard payments flow through that layer. There are more details in their developer center, but I don’t have access. Details for devs here.

Summary of Agentic Payment Protocols

We now have all of these Agentic Payment Protocols vying for adoption from one of the four participants: merchants, agent builders, consumers and ‘distribution’ (Google, LLM Answer Engines, etc.). It’s going to be interesting to see if this is a scenario where one comes out on top, or we end up with 2-3, OR whether someone builds an ‘interoperability’ layer that can take any of these protocols as input on one side and ‘convert’ it to another one on the output side.

Agentic Commerce Protocol (ACP) – developed by OpenAI in partnership with Stripe. Enables AI agents, people, and merchants to coordinate and execute purchases via merchant’s existing payment infrastructure (merchant remains merchant of record).
Agent Payments Protocol (AP2) – developed by Google Cloud / Google with more than 60 payment/tech partners. Focuses on the trust, authorization and audit layers for AI/agent-led payments (e.g., cryptographically signed “mandates” for agent spending).
Trusted Agent Protocol (TAP) – from Visa. Enables merchants to identify authorized AI agents, link them to consumer intent and securely transmit payment credentials in an agent-led transaction.
x402 – developed by Coinbase. A lighter-weight standard using HTTP 402 (“Payment Required”) for programmatic, on-chain or token-based payments (especially machine-to-machine or API-driven). While not exclusively for “agent commerce” in the shopping sense, it’s cited in the context of agentic payments.
Model Context Protocol (MCP) – originally an agent/LLM interoperability protocol (for context/tool invocation), but now increasingly used as part of the agentic payments ecosystem (agents sharing context and authorization state).
Agent to Agent Protocol (A2A) – developed by Google. A communication/coordination protocol between agents; this is more about agent-agent messaging than payments, but it is part of the stack that enables agentic commerce, and I imagine they’ll add payments to it or merge it with AP2 in some way.
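To make the x402 idea a bit more concrete, here is a rough, in-memory sketch of the ‘ask, get a 402, pay, ask again’ loop that an HTTP 402 (“Payment Required”) flow implies. This is not the x402 spec: the header names (`X-Demo-Price-Cents`, `X-Demo-Payment`), the price and the `paid:` proof string are all made up for illustration, and a real implementation would carry a signed payment payload rather than a plain string.

```python
from dataclasses import dataclass, field

PRICE_CENTS = 25  # hypothetical price for the resource


@dataclass
class Response:
    status: int
    headers: dict = field(default_factory=dict)
    body: str = ""


def paid_endpoint(request_headers):
    """Toy server: answers 402 until the request carries a payment proof."""
    proof = request_headers.get("X-Demo-Payment")
    if proof != f"paid:{PRICE_CENTS}":
        # Tell the caller what it costs instead of serving the resource.
        return Response(402, {"X-Demo-Price-Cents": str(PRICE_CENTS)}, "Payment Required")
    return Response(200, {}, "here is the resource")


def agent_fetch(budget_cents):
    """Toy agent: try once, and if asked to pay, pay only when within budget."""
    first = paid_endpoint({})
    if first.status != 402:
        return first
    price = int(first.headers["X-Demo-Price-Cents"])
    if price > budget_cents:
        return first  # too expensive, give up and surface the 402
    # Retry the same request, this time attaching the (fake) payment proof.
    return paid_endpoint({"X-Demo-Payment": f"paid:{price}"})


if __name__ == "__main__":
    print(agent_fetch(budget_cents=100).status)
    print(agent_fetch(budget_cents=10).status)
```

The interesting design point is that the budget check lives with the agent, not the merchant, which is the same ‘who authorized this spend?’ question the other protocols on this list are trying to answer in their own ways.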

We’ll continue watching this competition/Co-opetition space closely to see which way it goes.

Strategy Consultants’ Thoughts on Agentic Commerce

A variety of strategy and industry consultants were out this week with some interesting takes on Agentic Commerce.

Forrester

First up we have Industry consultancy, Forrester:

This piece questions, ‘Is Agentic Commerce a thing?’ You all know where I fall on this one, where will Forrester land?

McKinsey: The agentic commerce opportunity

This McKinsey piece is lengthy (Five Chapters!) and quite fancy with interactive diagrams and cool scrollex’ey things along with neato shimmery images that look AI generated, but maybe they aren’t?

If you don’t have a good chunk of time for this one, here’s my TL;DR→

McKinsey’s model says that by 2030, in the US alone, $1T of revenue will be ‘orchestrated’ with Agentic Commerce. That’s the biggest number I’ve seen, so I’m inclined to like it 😆. Personally I see Agentic Commerce as an influence/orchestration layer that sometimes is autonomous when the buyer wants it to be. I do NOT like the very tight definition everyone else seems to look for, where if a transaction isn’t fully 100% autonomous (my level 5), buying everything for you with ZERO buyer input, then it is NOT AGENTIC COMMERCE. That definition is designed to make it seem like it’s having a small impact because it eliminates all the influence/orchestration use cases. McKinsey: 1, Naysayers: 0
They say this:

“From a technical standpoint, that means mastering and deploying emerging integration enablers like Anthropic’s Model Context Protocol (MCP), Agent-to-Agent (A2A) Protocol, Agent Payments Protocol (AP2), and Agentic Commerce Protocol (ACP) that enable a new era of intelligent, autonomous agents. It also means fundamentally re-architecting approaches to identity management and loyalty. Smart organizations are already beginning to create new, agent-ready sites that provide both strong agentic and consumer experiences.”

I don’t agree with this one because that’s a scary list; no merchant I know can handle more than one of these. Today it’s clear that ACP is going to add incremental buyers and transactions, and everyone is doing that one.

They have this graphic of the customer journey that I found interesting:

Then they come up with these six areas where business should invest:

I generally agree with this one. As you’d expect from McKinsey, this report is well written and sourced, provides interesting new

Retail TouchPoints - Inevitable, but….Narrow?!

Industry publication Retail TouchPoints brought in two venture folks from VMG Technology, Indy Guha and Dhruv Bansai (both definitely Ninja-level at Agentic Commerce; we’ve previously had some fun discussions), to pen a piece: “Agentic Commerce: The Inevitable, but Narrow, Future of AI Shopping.”

The ‘….but Narrow’ qualifier caught my attention. They make two arguments on why we’ll have Agentic Commerce, but it will be small:

Merchants won’t adopt - they say: “Why would any retailer or brand willingly become a shipping utility?” and point to Walmart and Amazon not participating as examples. I haven’t had a chance to give them a hard time on this, but they definitely got this one wrong, because, well, Walmart. I think what they are missing here is that merchants are always interested in acquiring incremental new customers, and because ChatGPT Instant Checkout is set up to have the merchant as merchant-of-record, you are not only generating revenue, but you’re acquiring a new customer and, yes, you are then the shipping utility for that transaction. How many of these customers will be ‘new to Walmart’? How many can they upgrade to Walmart+? Nobody knows, but Walmart seems interested in finding out.
Follow the Consumer - They use this 2x2 matrix to make this point→

They argue that the only quadrant applicable to Agentic Commerce is the bottom left - low consideration / utility products. Basically Agentic Commerce is only good for ordering toilet paper 🧻 and Walmart and Amazon already have that locked up.

☹️ Womp Womp - RIP 🪦 Agentic Commerce

I suspect we’re going to go back to definitions here. I define Agentic Commerce as when an Agent helps you buy - assist or influence. In my definition, the Agent may take you all the way through Research→Find→Buy. It may just help you with Research or through Find. It’s your shopping assistant and it’s up to the buyer where they want to go.

In any case, here are some examples to think about to see if I can convince you that it’s a mistake to ignore the other three quadrants.

High consideration / Utility - The graphic shows ‘monitors’. Have you shopped for a monitor lately? I have a Mac and finding a monitor is super complicated, and the number of options is overwhelming to the point I’d never buy one without the aid of AI. Have you seen how many choices there are with monitors? In addition to the usual price/brand/resolution you have the stand and, hardest to figure out, compatibility. My Mac has one HDMI port (used already), so I need this monitor to be 100% compatible with USB-C/DisplayPort. Guess what? No ecommerce retailer lets you filter on that. I actually need a monitor, so let’s do this!

Look how helpful this is - I have a range from $200-400, it’s already narrowed the choices to my compatibility and it took about 8 seconds to get here. I then tried to compare on Amazon and gave up. I ended up getting the S7 from BestBuy thanks to ChatGPT’s reco (I had to hop over there as they aren’t in ChatGPT checkout yet, but I would have 100% used the checkout on ChatGPT, as I always use guest checkout and constantly re-enter my bill-to/ship-to/payment info). Boom! Take that, upper left quadrant!

Low Consideration / Inspiration - Here the graphic lists the examples of musical instruments, drinkware and fast fashion.

Here’s a Research/Find journey in ChatGPT for a drum set. There is no other way, other than Agentic Commerce, to have such a great ‘low consideration / inspiration’ experience IMHO. For the record I know 0 about drum kits, as you can tell by my prompt:

You can’t see it here, but the carousel is oriented from a $250 to a $750 option and does a good job underneath explaining what happens in those price jumps. This is actually better than if I went to a store because the selection is much wider.

High Consideration / Inspiration - Here the graphic shows: Inspirational travel, designer bags and ‘swiss watch’. Challenge accepted! We don’t do travel here on Retailgentic, and designer bags aren’t my jam, but let’s do some high consideration inspo shopping for Swiss watches. This is high consideration so I’m gonna have to go into Gallery mode to show my 8-step journey.

In just 5 prompts I learned more about high-end Swiss watches than I could in a store, from Wikipedia or Reddit, and even from most watch enthusiasts. The Agent (ChatGPT) knew in prompt 2 to ask about some preferences and took them into consideration - it found stores near me, it found them online so I could research, and I’ve quickly narrowed it down to 2-3 models (Find). I’m really digging the Blancpain Bathyscaphe, but definitely want to compare the black face vs. the green, given this is basically like buying a car. And I definitely don’t want an agent throwing this on my card. I want to see it, try it on and ask a human a bunch of questions. Therefore at this point in the journey, I’m parting ways with my agentic buddy who just helped save me hours and hundreds of tabs from Google. Check this out, the agent even conveniently gave me a checklist of what to ask for in the store!

Agentic Commerce Covers all Four Quadrants…

My point is, sure you wouldn’t have an agent buy a $40,000 watch for you, but for luxury retailers and brands I don’t think it would be wise to dismiss Agentic Commerce, because this Agent just taught me about your product and I never visited your website. Don’t you want to see what you can do to optimize that? For all we know, BrandX could have a great product and they are blocking all bots and I’ll never know about it.

If I were running marketing for Blancpain, I’d want to look at all this and make sure the Answer Engine was right. I’d also want to know what I need to do to beat out that Jaeger. If I was Jaeger, I’d look at their product cards, something is wonky when you see a ~30% price delta→

Jaeger, give me a 📞 - we’ll work a deal 😃

Interesting Company-Specific News: Perplexity and Amazon

Perplexity working on a virtual try on service?

There’s an X account called testingcatalog that scans all the LLM Answer Engines for interesting tests and features sneaking out. Recently they caught this in the personalize screen:

You’ll notice that there’s some feature coming out that has a ‘Virtual try on avatar’. We’ll keep an eye on this one and see if it turns into something.

Amazon Internal documents reveal $700m 2025 profit from Rufus with $1.2B coming next year!

The reporters over at Business Insider got a hold of an internal Amazon document that revealed some interesting never-seen-before datapoints on Rufus:

Rufus is contributing $700m in profits this year
The company uses a metric ‘downstream impact’ (DSI) to measure new features
Rufus’ DSI is based off of the increase in sales and ad revenue (ads are now in Rufus 🤦‍♂️)
In 2024, the feature ‘cost’ $285m in DSI
Next year they are expecting $1.2B in DSI
They also talk about the surface area of products in Rufus. In 2024 it was $164b, in 2025, $712b, next year $850b. Imagine indexing all of the hundreds of millions of products; I’m sure there’s a prioritization exercise, and the way that surface area increase is slowing down, they may feel like they’re reaching the point of diminishing returns.
Internally they call the AI ‘Shopping LLM’ and expect to 3X its size. I suspect they are talking about the parameters in the model - they must be training a custom LLM on the product data.
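A quick back-of-the-envelope on those numbers makes the ‘slowing down’ point visible. This is just arithmetic on the figures quoted above (the variable and function names are mine), not anything from the Business Insider piece itself.

```python
# Figures as quoted above, in the order reported (all in $ billions).
surface_area_b = [164, 712, 850]
# DSI for this year and next year (in $ millions); the 2024 figure was a cost.
dsi_m = [700, 1200]


def growth_rates(values):
    """Period-over-period growth for each consecutive pair of values."""
    return [(later - earlier) / earlier for earlier, later in zip(values, values[1:])]


if __name__ == "__main__":
    for rate in growth_rates(surface_area_b):
        print(f"surface area growth: {rate:.0%}")
    for rate in growth_rates(dsi_m):
        print(f"DSI growth: {rate:.0%}")
```

Surface area goes from roughly 334% growth in the first jump to roughly 19% in the second, while the expected DSI still grows about 71%. So the catalog coverage is flattening out while the dollars it drives are still expected to climb.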

My big question is if Rufus is a better Research→Find experience than the homepage/search experience, why not make it the default or, like Google, have it populate the top of the page so buyers get what they want faster? My speculation is there is economic friction between optimizing buyer activity and the Ad revenue which actually makes more $ when the user clicks more products - taking longer to find what they want.

In any case, it’s a very interesting and rare look inside Amazon and how they are thinking about things.

Wrap and Next Week on Retailgentic

That’s a wrap on this week - phew, busy one! Next week we’ll have part IIIB, the Grand Finale of our GEO→ACO series where we reveal the Agentic Commerce Optimization Playbook.
