Field test — 2025–2026

What we found when we built a customer-side AI.

Most companies assume only they control the AI in customer service. In 2025 we wanted to explore the customer side, so we built and ran an AI on real orders, with real problems, into real contact centres. The clip below is from one of the calls. The thing we didn't expect to find is further down the page.

Test ID: 003 (live contact centre call)
Duration: average 1 minute 58 seconds per call (3 calls)
Date: March 2026
Channel: inbound voice call to UK retailer support line
Consumer side: AI agent (ElevenLabs voice, custom orchestration, Zendesk system of record) acting on behalf of the named consumer
Brand side: live human agent on the retailer's standard support queue
Outcome: identity verified, delivery issue logged, case escalated, follow-up promised within 24–48 hours

What you're watching

The voice on the consumer side is an AI agent. It is acting on behalf of a real customer with a real delivery problem — part of an order left on the doorstep by the courier, then missing; another part delivered to the wrong address. The AI provides the order number, mobile number, full name, email address, and full postal address on request, and answers the agent's questions in conversational English.

The voice on the brand side is a human contact centre agent running through standard identity verification and case-handling procedures.

The interaction proceeds exactly as a routine customer service call would. The case is logged, an existing escalation is acknowledged, and a 24–48 hour response window is confirmed.

How we set it up

The test ran in a live commercial environment, not a sandbox. The customer was real, the order was real, the issue was real, and the contact centre was operating under its normal procedures. The AI was given the customer's identifying details, the order context, and a brief: resolve the missing-delivery issue. It was not given a script.

The recording was made on the consumer side with the customer's consent. The version on this page is a short cut from one of the three calls; the average call length was just under two minutes, with most of that time spent on the brand side checking systems. We share the full recordings with brands who get in touch. Surnames, addresses, and other identifying details remain in the clip because they're part of what the AI was asked to handle — and because the point of the exercise was to see how the call worked when those details were present, exactly as a human caller would provide them.

Built on GDPR-compliant infrastructure. The customer-side AI was assembled from systems that meet UK GDPR and the UK Data (Use and Access) Act 2025: explicit consumer consent for the call and the recording, lawful basis recorded for each personal data field handled, EU/UK data residency for the voice and orchestration layers, and no retention of personal data beyond the test record. The same compliance posture is embedded in the engine we build for brands.
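
One way to picture "lawful basis recorded for each personal data field handled" is a small per-field register that is checked before a call is placed. The Python sketch below is illustrative only: the field names, the register layout, and the basis chosen for each field are assumptions, not a description of the system used in the test. The six bases listed are the ones set out in Article 6 of UK GDPR.

```python
from enum import Enum


class LawfulBasis(Enum):
    """The six lawful bases for processing listed in Article 6 of UK GDPR."""
    CONSENT = "consent"
    CONTRACT = "contract"
    LEGAL_OBLIGATION = "legal obligation"
    VITAL_INTERESTS = "vital interests"
    PUBLIC_TASK = "public task"
    LEGITIMATE_INTERESTS = "legitimate interests"


# Hypothetical register: one entry per personal data field the AI may handle.
# The fields mirror those the AI supplied on the call; the basis shown for
# each one is an assumption made for the sake of the example.
FIELD_REGISTER = {
    "order_number": LawfulBasis.CONSENT,
    "mobile_number": LawfulBasis.CONSENT,
    "full_name": LawfulBasis.CONSENT,
    "email_address": LawfulBasis.CONSENT,
    "postal_address": LawfulBasis.CONSENT,
}


def fields_missing_basis(fields_handled, register):
    """Return, in sorted order, the fields that have no recorded lawful basis."""
    return sorted(f for f in fields_handled if f not in register)


if __name__ == "__main__":
    # A field outside the register is flagged before the call is placed.
    print(fields_missing_basis(["full_name", "date_of_birth"], FIELD_REGISTER))
```

A check like this would run before the AI is given the customer's details, so that a field with no recorded basis blocks the call rather than surfacing afterwards.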

The thing we didn't expect to find

The headline finding from the recording is the easy one: a customer-side AI can call a contact centre, pass identity checks, and get a routine issue logged and escalated. We expected that. The interesting observation came from listening back to the calls.

The AI set the temperature of the interaction. The human agent met it there.

Across the calls we ran, the contact centre agents matched the pace and tone of the AI — calm, even, objective. There was no rising emotion when the conversation got into specifics about the missing parcel, no defensive posture, no audible frustration. The AI didn't bring frustration to the call, and the agents didn't bring any either. The interaction stayed steady from the first second to the last.

The calls also struck us as more efficient than a comparable human-to-human call would have been: less hedging, fewer detours, a faster route to a logged case. The AI provided every requested piece of information on the first ask, didn't repeat itself, and didn't talk over the agent. The agent, in turn, worked the issue cleanly and moved on.

If you've ever been on the receiving end of a tense customer service call, the absence of tension is the thing you notice. The AI in the loop, on the customer's side, made the interaction better for the human on the brand's side — calmer, clearer, easier to handle. That's a finding worth sitting with.

The everyday version of this is already happening

The voice channel is the dramatic case. The everyday case is quieter, and it's already routine. Customers are using ChatGPT, Claude, and Gemini to draft chat messages and complaint emails before they reach your inbox. They're pasting your replies back in to translate, summarise, or check for the catch. Increasingly they're using AI to think through how to phrase a complaint for the best outcome — which channel, which words, which escalation triggers.

None of this surfaces in standard CX analytics. By the time a human agent or a brand-side bot sees the message, the AI involvement has already happened. The call we recorded is the visible end of something that's already shaping the texture of every chat session and email exchange your operation handles today. We just made it audible.

What this tells us

Three calls are an anecdote, not a benchmark. We're not claiming customer-side AI is ready for every kind of service interaction, and we're not benchmarking against other agents. What this exercise tells us is narrower and, we think, more useful:

  • The capability is here. Voice quality, conversational handling, identity-check navigation — all at a level where a competent human agent works the call as a normal call.
  • The interaction can be better, not worse. AI on the customer side removed friction we usually attribute to the customer being human — emotional escalation, repetition, talking past each other. That's a useful thing to know.
  • The interaction layer wants a standard. A call where neither side has declared what the AI is authorised to do, what data it can handle, or what should happen on failure works fine until it doesn't. The Service Handshake is our open answer to that — what each party brings to the interaction, written down, before it begins.

What this could look like with the standard on both sides

The customer-side AI in this recording was built on the Service Handshake — goals, authority, data permissions, and fallback rules all declared before it placed the call. The contact centre had no equivalent. It worked the interaction as a routine human-to-human call, because that was the only protocol it had.
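
As a rough illustration of the paragraph above, the four things declared before the call (goals, authority, data permissions, fallback rules) could be held in a structure like the Python sketch below. The class, field names, and example values are hypothetical and are not taken from the Service Handshake specification; only the list of disclosed fields and the brief come from the test described on this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HandshakeDeclaration:
    """Hypothetical pre-call declaration; not the published Service Handshake schema."""
    goals: tuple            # what the AI is trying to achieve on the call
    authority: frozenset    # actions the AI may take on the customer's behalf
    data_permissions: frozenset  # personal data fields the AI may disclose
    fallback: str           # what happens when a request falls outside the above

    def may_disclose(self, field_name: str) -> bool:
        return field_name in self.data_permissions

    def may_perform(self, action: str) -> bool:
        return action in self.authority


def respond_to_request(decl: HandshakeDeclaration, field_name: str) -> str:
    """Disclose a field only if it was declared; otherwise apply the fallback rule."""
    if decl.may_disclose(field_name):
        return f"disclose:{field_name}"
    return f"fallback:{decl.fallback}"


# Example values. The action names and fallback label are assumptions.
EXAMPLE = HandshakeDeclaration(
    goals=("resolve the missing-delivery issue",),
    authority=frozenset({"log_case", "request_escalation"}),
    data_permissions=frozenset({
        "order_number", "mobile_number", "full_name",
        "email_address", "postal_address",
    }),
    fallback="hand_back_to_customer",
)

if __name__ == "__main__":
    print(respond_to_request(EXAMPLE, "order_number"))
    print(respond_to_request(EXAMPLE, "card_number"))
```

Making the declaration immutable (`frozen=True`) is a deliberate choice in this sketch: what the AI is allowed to do is fixed before the call and cannot change mid-conversation.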

If the brand side had also been running a Service Handshake, the same call resolves on different terms: the contact centre knows what it's talking to, agent authority is on the record, the interaction is auditable end to end, and the calmness we observed becomes the design — not a side effect of one party being lucky.

The recording above is what it looks like with the standard on one side. The interesting work is what it looks like with the standard on both.

Worth knowing about timing

The EU AI Act's transparency obligations start to apply in August 2026.

Mode 3 / 4 readiness — declared authority, disclosure, fallback handling — maps directly to those obligations. Brands designing for this now will have a tested, documented framework in place before the deadline.

Want to see what your contact centre does with an AI caller?

We can run the same kind of test against your operation — recorded calls, written observations, results shared with you only. Typically four weeks end-to-end, scoped to your scenarios.

Confidential by default. NDA in place from first conversation. Findings shared with you only — never published without your written consent.