Operate self-hosted AI infrastructure

Empowering sovereign, sustainable, and scalable AI for commerce search and discovery

Empathy Platform's AI infrastructure prioritizes sovereignty, sustainability, and scalability. It powers advanced search and discovery through private, in-house, high-performance computing that ensures data privacy and operational resilience. Self-hosted AI is designed for organizations seeking commerce-ready intelligence with full control over deployment, embedding capabilities like Related Prompts, AI Overview, or AI Mode directly into their cloud or data center, aligned with their security, governance, and compliance needs.

Running AI in owned environments lets infrastructure decisions match your strategy, legal requirements, and sustainability goals, with clear visibility into processing, services, and scaling, unlike opaque third-party stacks. It becomes a long-term investment, evolving models and workloads without provider changes or license renegotiations.

Empathy's self-hosted AI delivers sovereign intelligence where you control the stack, not the other way around.

Empathy Platform built for self-hosted AI

Empathy has chosen fully in-house AI infrastructure on dedicated private clouds, deployable on-premises or in commercial clouds, ensuring control over data, code, and operations while matching hyperscale performance. Core AI services run on dedicated hardware in Empathy's net-zero bioclimatic headquarters, minimizing latency for real-time commerce without external dependencies.

Cloud-agnostic by design, the platform supports managed self-hosted modes, seamlessly integrating with your stack. AI workloads, such as conversational search and GenAI agents, leverage your compute, storage, and networking resources. Empathy teams assist with integration, CI/CD, monitoring, and recovery, providing reference architectures while you own decisions, turning self-hosted AI into a stable platform component.

Extensible microservices with plugins avoid lock-in, offering single- and multi-tenant options that scale from startups to enterprises with guaranteed uptime and no exposure to third-party outages.

Self-hosted hardware and performance

Built around NVIDIA DGX Spark supercomputers and L40 GPU servers, the platform enables sophisticated workloads such as conversational search, RAG over domain datasets, and model refinement, without API dependencies. Each DGX Spark handles 128 isolated workspaces, saving €130K+ annually versus cloud licensing and empowering rapid iteration.
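The RAG pattern mentioned above can be sketched in a few lines: retrieve the most relevant domain documents, then ground the model's prompt in them. This is a minimal, self-contained illustration with a toy keyword retriever and a hypothetical product catalog, not Empathy's actual API; a real deployment would use a vector index and an open-weight model served on local GPUs.

```python
# Minimal RAG sketch: retrieve domain documents, then build a grounded prompt.
# The retriever, prompt format, and catalog below are illustrative only.

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Rank documents by keyword overlap with the query (toy retriever)."""
    terms = set(query.lower().split())
    ranked = sorted(docs,
                    key=lambda d: len(terms & set(d.lower().split())),
                    reverse=True)
    return ranked[:k]

def build_prompt(query: str, context: list[str]) -> str:
    """Assemble a prompt that grounds the model in the retrieved context."""
    ctx = "\n".join(f"- {c}" for c in context)
    return f"Answer using only this catalog context:\n{ctx}\n\nQuestion: {query}"

catalog = [
    "Trail runner shoes with waterproof membrane",
    "Road cycling helmet with MIPS protection",
    "Waterproof hiking jacket with taped seams",
]
context = retrieve("waterproof shoes for trail running", catalog)
print(build_prompt("waterproof shoes for trail running", context))
```

Because retrieval and generation both run on local hardware, the catalog never leaves the private environment.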

Dedicated hardware ensures predictable latency for search, recommendations, and agents, colocated with your stack to cut network hops and bottlenecks. Costs shift from OpEx subscriptions to reusable CapEx, with hosting, discovery, analytics, and tools on a single system, allowing internal scaling on your own terms.

Open-weight models and sovereignty

Empathy uses transparent, open-weight models such as OLMo 3, deployed onsite for privacy, ensuring no data leaves your control. Fully inspectable and adaptable, they avoid black boxes, supporting ethical practices and merchant customization.

Models run in private environments, are auditable for behavior, and are fine-tunable to domains. Retailers set tuning data use, retention, and guardrails per governance. Domain-specific datasets from anonymous behavior deliver relevance without compromising consent or privacy.
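Building tuning datasets from anonymous behavior can be pictured as a small preprocessing step: PII-like fields are dropped and session identifiers are one-way hashed before any event reaches the fine-tuning corpus. The field names and salt below are hypothetical, not Empathy's actual event schema; this is a sketch of the principle, not the production pipeline.

```python
import hashlib

# Hypothetical event fields treated as personal data and never stored.
PII_FIELDS = {"user_id", "email", "ip_address"}

def anonymize(event: dict, salt: str = "rotate-me") -> dict:
    """Drop PII fields and replace the session id with a salted one-way hash,
    keeping only behavioral signals (query, clicks) for tuning data."""
    clean = {k: v for k, v in event.items() if k not in PII_FIELDS}
    digest = hashlib.sha256((salt + event["session"]).encode()).hexdigest()
    clean["session"] = digest[:16]
    return clean

raw_event = {
    "session": "abc-123",
    "user_id": "u-42",           # dropped before storage
    "email": "x@example.com",    # dropped before storage
    "query": "waterproof jacket",
    "clicked_product": "sku-981",
}
print(anonymize(raw_event))
```

The salted hash still lets events from one session be grouped for relevance tuning, while the original identifier cannot be recovered.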

Sustainability and self-hosted AI

Self-hosted AI ties to sustainability: Empathy's private cloud for AI agents runs in a net-zero bioclimatic building powered by solar energy, using dedicated GPUs and local processing to lower its footprint compared to hyperscalers.

Extend this by choosing renewables, batch optimization, and local-first compute. This approach aligns with Empathy's ethics, which emphasize privacy as a right without personal data storage, while balancing innovation, independence, responsibility, and trust.

AI already running in a self-hosted infrastructure

Start with use cases like Related Prompts and AI Overview on a private cloud, keeping data under your control, then expand to conversational AI Mode, merchandiser tools, or workflows. Choose self-managed (your teams run the infrastructure) or collaborative models, with Empathy expertise for scaling and compliance. The result: stable, sovereign AI that avoids lock-in and outages.

Truman, a self-hosted conversational AI example

Try a real-world example that runs entirely in a private environment using Empathy's self-hosted infrastructure: Truman, Casa del Libro's AI-powered book recommender. It helps shoppers discover books through natural conversations, recommending titles based on preferences and context—without sending data to external providers.