AI Mode: the architectural shift behind conversational search
In “Welcome, AI Mode”, we introduced Empathy’s conversational search as a new layer within Empathy Platform, one that complements the current keyword and hybrid search offering.
This second part focuses on what makes AI Mode structurally different from other AI-based search solutions. Because here’s the thing: the difference is not about the interface. It’s about the underlying architecture.
Let’s start with the basics.
AI-based search: wrapper or infrastructure?
Most AI-based search solutions today work the same way: a search platform integrates with a third-party LLM API, such as OpenAI, Google Vertex, or Anthropic. Queries and catalog data are sent externally. A response is returned and the UI displays it.
It’s simple, and that simplicity has real appeal. The barrier to experimentation is low. But this approach also creates structural dependencies that are harder to see upfront: external data processing, variable token-based pricing, vendor-driven roadmap decisions, layered compliance responsibilities, and limited control over how models evolve.
AI Mode works differently. Conversational search runs on Empathy’s managed private cloud infrastructure, using self-hosted GPU clusters and open-weight language models. These models are selected and adapted based on specific AI tasks—some optimized for conversational text generation, others for product ranking and retrieval—rather than relying on a single general-purpose model for all commerce search needs. In this approach, AI is not an external add-on or an API wrapper. It’s part of the search platform itself. It’s infrastructure. And this architectural choice affects everything else.
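As a rough illustration of what task-specific model selection can look like, here is a minimal routing sketch. The model names and the task mapping are invented for this example; they are not Empathy’s actual configuration.

```python
# Illustrative sketch only: model names and task mappings below are
# hypothetical, not Empathy's actual deployment.

TASK_MODELS = {
    "conversation": "olmo3-instruct",  # conversational text generation
    "ranking": "ranker-small",         # product ranking and reranking
    "retrieval": "embedder-base",      # dense retrieval embeddings
}

def select_model(task: str) -> str:
    """Route each AI task to a model adapted for it, instead of sending
    every workload to one general-purpose model."""
    if task not in TASK_MODELS:
        raise ValueError(f"Unknown AI task: {task!r}")
    return TASK_MODELS[task]

print(select_model("ranking"))  # ranker-small
```

The point of the sketch is the shape, not the names: each commerce task gets a model sized and tuned for it, all running inside the same infrastructure.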
Data sovereignty
With AI Mode, inference happens inside Empathy’s controlled architecture. In practice, this means that customer queries, catalog data, and interaction signals are processed within a private cloud perimeter operated by Empathy. They are not routed to external LLM endpoints. They don’t get exposed to third-party model training pipelines.
You get privacy by design, not privacy by configuration.
Simplified compliance
When your AI stack relies on multiple external providers, compliance responsibilities multiply fast: more subprocessors, layered contracts, and overlapping audits. Each additional processor adds technical, contractual, and audit complexity.
Since AI Mode runs within Empathy’s infrastructure, the regulatory boundaries are clearer. Retailers work under a single Data Processing Agreement with Empathy Holdings. Processing happens within EU-based facilities. No cross-border transfers to external, US-based LLM endpoints. Data deletion processes stay under direct control, in line with GDPR.
The benefit is not just compliance; it’s structural simplicity.
Service availability without vendor dependency
External AI APIs can change their pricing models, rate limits, terms of service, or model availability. And those changes happen independently of the search platform that integrates them. You find out when everyone else does.
By running on self-hosted infrastructure with open-weight models (including OLMO3 and other open architectures), AI Mode sidesteps these dependencies. Your contractual relationship stays between you and Empathy. There’s no external vendor who can retroactively alter contractual conditions. If upstream providers experience outages, AI Mode keeps functioning because inference does not rely on their endpoints.
Service availability—and therefore platform reliability—becomes something you can count on, not something you hope for. It’s built into the architecture.
Predictable economics, not per-token pricing
E-commerce traffic is not linear. Campaigns, seasonality, and peak events create dramatic fluctuations in query volume.
With most API-based AI, cost scales per token or per request. AI Mode operates within a managed subscription model instead. Infrastructure scaling happens within Empathy’s private cloud, not through external per-token passthrough.
This does not eliminate cost, but it does reduce volatility and external exposure. AI becomes part of the platform’s economics rather than an external, token-based service layered on top.
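A back-of-envelope comparison makes the volatility point concrete. Every figure below is invented for illustration; none reflects Empathy’s or any vendor’s actual pricing.

```python
# Hypothetical numbers for illustration only; real pricing varies by vendor.

PRICE_PER_1K_TOKENS = 0.002  # assumed external API rate (USD)
TOKENS_PER_QUERY = 1_500     # prompt + catalog context + response
FLAT_MONTHLY_FEE = 20_000    # assumed managed-subscription fee (USD)

def api_cost(queries: int) -> float:
    """Per-token cost grows linearly with query volume."""
    return queries * TOKENS_PER_QUERY / 1_000 * PRICE_PER_1K_TOKENS

normal_month = api_cost(3_000_000)
peak_month = api_cost(12_000_000)  # e.g. a Black Friday traffic spike

print(f"API, normal month: ${normal_month:,.0f}")   # $9,000
print(f"API, peak month:   ${peak_month:,.0f}")     # $36,000
print(f"Subscription:      ${FLAT_MONTHLY_FEE:,}")  # flat either way
```

Under per-token pricing, a 4x traffic spike is a 4x bill; under a subscription, the spike is absorbed by the platform’s own capacity planning.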
AI that respects merchandising control
There’s no doubt that conversational interfaces can generate fluent responses. That’s not the challenge anymore. The challenge is making sure they stay aligned with business logic. Because AI Mode operates inside the search platform, it connects structurally to the Playboard’s merchandising controls. Retailers can influence tone, context, exclusions, ranking rules, and strategies. Fine-tuning happens within the private cloud environment, using retailer-specific configurations and catalog data.
This turns conversational AI from a generic assistant into a governed extension of search relevance: it’s not just AI that talks; it’s AI that participates in the store’s strategy.
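To make the idea of governed output concrete, here is a minimal sketch of business rules applied to retrieved products before the conversational layer describes them. The rule names and product schema are hypothetical, not the Playboard’s actual configuration.

```python
# Illustrative sketch; rule names and product fields are hypothetical,
# not the Playboard's actual configuration schema.

merchandising_rules = {
    "excluded_categories": {"clearance"},
    "boost_brands": {"house-brand": 2.0},
}

def govern(candidates: list[dict], rules: dict) -> list[dict]:
    """Filter and re-rank retrieved products by business rules before
    the conversational layer ever talks about them."""
    allowed = []
    for p in candidates:
        if p["category"] in rules["excluded_categories"]:
            continue  # merchandiser exclusions win over raw relevance
        boost = rules["boost_brands"].get(p["brand"], 1.0)
        allowed.append({**p, "score": p["score"] * boost})
    return sorted(allowed, key=lambda p: p["score"], reverse=True)

products = [
    {"name": "A", "brand": "house-brand", "category": "shoes", "score": 0.6},
    {"name": "B", "brand": "other", "category": "shoes", "score": 0.9},
    {"name": "C", "brand": "other", "category": "clearance", "score": 0.95},
]
print([p["name"] for p in govern(products, merchandising_rules)])  # ['A', 'B']
```

The structural point: because governance runs inside the platform, the model never sees excluded products, and boosted products arrive pre-ranked, so the generated answer inherits the merchandising strategy by construction.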
Inference and performance
AI Mode inference runs within the same controlled infrastructure as search indexing and retrieval systems, meaning the language models, product catalog, and search data all operate within the same network environment. This architectural proximity reduces cross-continental API calls, external network hops, and latency unpredictability.
Unlike multi-tenant public cloud deployments, where your workloads share physical infrastructure with unrelated services, Empathy’s AI environment is dedicated to Empathy customers only.
For conversational search experiences, this infrastructure proximity matters. Response time depends on the infrastructure the platform controls, not on external API availability or shared vendor capacity.
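A simple latency budget shows why proximity matters. Every number below is hypothetical and chosen only to illustrate the shape of the trade-off.

```python
# Back-of-envelope latency budget; all numbers here are hypothetical.

def response_time_ms(inference_ms: int, network_hops: list[int]) -> int:
    """Total response time = model inference plus the network path to it."""
    return inference_ms + sum(network_hops)

# Inference co-located with the search index: one intra-network hop.
same_network = response_time_ms(250, [2])
# Inference behind a distant external API: cross-continental round trip
# plus provider-side queueing, neither under the platform's control.
external_api = response_time_ms(250, [90, 40])

print(same_network, external_api)  # 252 380
```

Inference time itself is the same in both cases; what the architecture removes is the variable network and vendor-capacity overhead stacked on top of it.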
Open models as a structural choice
Closed proprietary models create long-term dependency risks: limited inspectability, opaque data-training processes, licensing uncertainty, and vendor-controlled evolution.
Empathy integrates open-weight models, such as OLMO3, into its private cloud architecture. This gives you greater transparency into how models are deployed and how they evolve over time.
Choosing open models is not ideological. It's structural risk management. It’s a way to reduce strategic exposure while maintaining control over how conversational AI evolves inside the platform.
Innovation at retailer pace
When conversational AI runs through external API providers, new features arrive on their timeline. Want to add multimodal search? Improve ranking logic? Integrate new data sources? You’re waiting for your vendor’s product roadmap.
When AI infrastructure is internal, capability development happens within the same controlled environment as the rest of the search platform. Retrieval strategies, reranking logic, multimodal extensions, and AI-powered merchandising tools can be developed and deployed based on retailer needs and Empathy’s roadmap, not external vendor priorities.
Innovation happens at your pace, not your API vendor’s pace.
A conversational interface with architectural advantage
The conversational search interface is what people see in AI Mode. Underneath sit a private cloud environment, an open-weight model deployment, integrated merchandising governance mechanisms, controlled compliance boundaries, and a predictable economic model.
These elements are not marketing differentiators. They are architectural decisions that affect how the system behaves.
AI in e-commerce determines which products are shown, how revenue is distributed, how brands are represented, and how data responsibilities are managed. Treating it as a wrapper over external APIs introduces fragility. Treating it as infrastructure introduces control and responsibility.
That’s the assumption AI Mode is built on.
Demo AI Mode!
Ready to explore what private cloud AI search means for your business? Contact us at info@empathy.co to schedule a demo.
Powered by Empathy.AI
AI Mode is powered by Empathy.AI, which provides infrastructure for human-centered, sustainable, and ethical AI. Visit EMPATHY.AI to learn how we make AI easier to own, understand, and control.