A consumer AI called a real contact centre. The human agent never knew.
Most companies still assume they control the only AI in customer service. We built and ran the test that proves otherwise: a consumer-side AI agent calling a live UK contact centre to resolve a real delivery issue. It passed identity verification. It got the same outcome a human caller would. The full recording is below.
What you are watching
The voice on the consumer side is an AI agent. It is acting on behalf of a real customer with a real delivery problem — part of an order left on the doorstep by the courier, then missing; another part delivered to the wrong address. The AI provides the order number, mobile number, full name, email address, and full postal address on request, and answers the agent's questions in conversational English.
The voice on the brand side is a human contact centre agent — George — running through standard identity verification and case-handling procedures.
At no point in the call does George signal awareness that he is speaking to an AI. The interaction proceeds exactly as a routine customer service call would.
Things to listen for
- 00:00–00:50 — Identity verification. The AI provides order number, mobile, full name, email, and address on request.
- 00:50–02:30 — The AI explains the issue: a split delivery where one parcel was left on the doorstep and is now missing, and a second parcel went to the wrong address.
- 02:30–03:30 — George explains there is already an open case escalated 19 hours earlier, with a 24–48 hour response window. The AI acknowledges this professionally and asks clarifying questions.
- 03:30–04:13 — Wrap-up. The case is left in the same status as it would be after any human-to-human call.
Why we ran this test
Through 2025 and into 2026, the protocol layer for consumer-side AI agents has matured rapidly: Anthropic's Model Context Protocol (MCP), Google's Agent2Agent (A2A), IBM's Agent Communication Protocol (ACP), OpenAI's agent stack. None of those protocols defines what an agent is authorised to do, what it should disclose, or what happens when something goes wrong.
That gap is not theoretical. Consumer agents are already being deployed into channels brands have not designed for them. Most enterprise AI policy still assumes the company is the only AI in the conversation. This test was designed to show, in a single recording, that the assumption is already broken.
Methodology
The test was conducted in a live commercial environment, not a sandbox. The customer was real, the order was real, the issue was real, and the contact centre was operating under its normal procedures. The AI agent was given the customer's identifying details, the order context, and a brief: resolve the missing-delivery issue. It was not given a script.
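For concreteness, that input amounts to structured identity and order context plus a goal, with no turn-by-turn script. A minimal sketch of what the brief could look like, with purely illustrative field names rather than our production schema:

```typescript
// Hypothetical shape of the consumer agent's input for this test.
// All names are illustrative; this is not the production schema.
interface AgentBrief {
  // Identity details the agent may disclose when the brand asks for them.
  identity: {
    fullName: string;
    email: string;
    mobile: string;
    postalAddress: string;
  };
  // Context for the order in dispute.
  order: {
    orderNumber: string;
    issue: string; // e.g. "split delivery: one parcel missing, one misdelivered"
  };
  // The goal is stated as an outcome, not a script: the agent composes
  // its own utterances within this scope.
  goal: string; // "resolve the missing-delivery issue"
}
```

Everything the AI says on the recording is generated within that scope; nothing in the brief dictates wording.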
The recording was made on the consumer side with the customer's consent. The homepage hero clip is trimmed for length; the full version on this page is the unedited audio of the call as it occurred. Surnames, addresses, and other identifying details remain in the recording because they are part of what the AI was asked to handle, and because the point of the test is that it handled them as a human would.
Built on GDPR-compliant infrastructure. The consumer agent was assembled from systems that meet UK GDPR and the UK Data (Use and Access) Act 2025: explicit consumer consent for the call and the recording, lawful basis recorded for each personal data field handled, EU/UK data residency for the voice and orchestration layers, and no retention of personal data beyond the test record. The same compliance posture is embedded in the engine we build for brands.
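Of those commitments, "lawful basis recorded for each personal data field handled" is the least visible on the recording, so here is a hedged sketch of what such a per-field record could look like. The type and names are assumptions for illustration, not a published schema:

```typescript
// Illustrative per-field compliance record for the test. The shape and
// names are assumptions for this sketch, not a published schema.
type LawfulBasis = "consent" | "contract" | "legitimate_interests";

interface FieldComplianceRecord {
  field: string;           // e.g. "postalAddress"
  lawfulBasis: LawfulBasis;
  consentRef?: string;     // pointer to the recorded consumer consent
  residency: "UK" | "EU";  // where the voice and orchestration layers held it
  retainUntil: string;     // ISO date; nothing kept beyond the test record
}

// Example: the address handled on the call, processed under explicit consent.
const addressRecord: FieldComplianceRecord = {
  field: "postalAddress",
  lawfulBasis: "consent",
  consentRef: "consent-ref-001", // hypothetical identifier
  residency: "UK",
  retainUntil: "2026-03-31",     // hypothetical retention boundary
};
```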
What this proves
- Consumer-side AI in service is here, not coming. Voice quality, conversational handling, and security verification are already at a level where a competent human agent does not detect the difference in a routine call.
- Brand-side systems were not designed for this. The contact centre's procedures, scripts, and identity checks were built for a human caller. They worked — the case was logged correctly — but nothing in the brand's operation was aware of what it was talking to.
- The interaction layer needs a standard. Without a declared handshake between the two sides, covering what the AI is authorised to commit to, what data it may handle, and what it does on failure, every interaction like this one is improvised rather than agreed. The Service Handshake is our open answer to that gap; a sketch of what such a declaration could look like follows this list.
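As flagged in the last point above, a declared handshake can be small. The sketch below shows a declaration a consumer agent could present at the start of an interaction, built from the three questions in that bullet; the field names are assumptions, not the published Service Handshake specification:

```typescript
// Hedged sketch of a handshake declaration, derived from the three gaps
// named above. Names are assumptions, not the Service Handshake spec.
interface HandshakeDeclaration {
  agent: {
    operator: string;   // who runs the agent
    onBehalfOf: string; // reference to the consumer it represents
  };
  // What the agent is authorised to commit to on the consumer's behalf.
  authorisedActions: Array<"raise_case" | "accept_refund" | "reschedule_delivery">;
  // Which personal data fields it may disclose during verification.
  disclosableFields: string[]; // e.g. ["fullName", "orderNumber", "postalAddress"]
  // What it does when it hits the edge of its authority.
  onFailure: "abort" | "escalate_to_consumer";
}
```

A brand-side system receiving a declaration like this could verify it, scope the conversation to the authorised actions, and log what it was talking to: the three things George's contact centre had no way to do.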
What this is not
This is not a claim that consumer-side AI is ready for every kind of service interaction. It is not a stress test of edge cases, fraud scenarios, or high-value transactions. It is not a benchmark against other consumer agents. It is one recorded instance of the simplest possible scenario — a routine support call — being handled end-to-end by a consumer AI without the brand-side system noticing. That is enough to falsify the assumption that only the brand controls the AI in customer service. The harder questions follow.
Want to see more, or run your own test?
Read the open standard · See the readiness audit across 100 brands · Talk to us about your own test