
Semantic and keyword search as a unified index

Recent advances in semantic models have highlighted semantic search as a way to deliver faster and more relevant search experiences in commerce. On the path to hybrid search, merging a semantic search approach with a traditional keyword-based solution into a unified index is a challenging task, as conflicts can arise when both mechanisms coexist.


To understand how Empathy Platform approaches semantic model training on a foundation of data privacy and consent integrity, see the Semantic models in semantic search page.

Semantic and keyword matches as a unified index

Keyword-based search relies on exact matches between the entered terms and the indexed product data, whereas semantic search uses numerical representations called vector embeddings to associate queries with the products in the catalogue. Because the two approaches work so differently, some adjustments are needed to ensure that they evolve together and complement each other.
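
The contrast between the two matching styles can be sketched in a few lines of Python. This is only an illustration, not Empathy Platform's implementation: the function names and the three-dimensional `TOY_EMBEDDINGS` vectors are hypothetical and hand-written, whereas real embeddings come from a trained semantic model and have many more dimensions.

```python
import math

def keyword_match(query: str, product_text: str) -> bool:
    """Return True only if every query term appears verbatim in the product text."""
    product_terms = set(product_text.lower().split())
    return all(term in product_terms for term in query.lower().split())

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (closer to 1.0 means more similar)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Hypothetical embeddings, invented for this example only.
TOY_EMBEDDINGS = {
    "sofa": [0.9, 0.1, 0.0],
    "couch": [0.85, 0.15, 0.05],
    "kettle": [0.0, 0.2, 0.95],
}

# Keyword search misses the synonym: "sofa" never appears in the product text.
print(keyword_match("sofa", "grey couch"))

# Vector proximity places "sofa" much closer to "couch" than to "kettle".
print(cosine_similarity(TOY_EMBEDDINGS["sofa"], TOY_EMBEDDINGS["couch"]))
print(cosine_similarity(TOY_EMBEDDINGS["sofa"], TOY_EMBEDDINGS["kettle"]))
```

The keyword check fails for the synonym while the vector comparison still relates the two terms, which is why the two approaches can return different result sets for the same query.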

Implementing semantic and keyword search techniques side-by-side requires addressing the following issues:

  • Privacy should be the foundation for designing semantic relations, to prevent the system from becoming a black-box solution.

  • The two techniques handle search differently, so a bridge between them needs to be built for search results to be aligned and displayed efficiently.

  • Transparency should be key so that brands and merchants can understand the inner workings of search through explainability.

  • Shoppers need to interact with the returned search results by applying filters, so actionable elements such as filters should rely on all indexed items, not only on semantic suggestions.

  • Semantic search results should be displayed differently to how keyword-based results are shown until the two can be effectively merged and coexist. Vector proximity calculations return fewer pages and products than keyword-based ones, which can disrupt user expectations.
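
As a rough illustration of the last two points, the sketch below applies the same filters to both result sets and returns them as separate sections instead of one blended list. It is a minimal sketch under stated assumptions, not the Empathy Platform API: `Product`, `apply_filters`, and `build_response` are hypothetical names, and filters are assumed to be simple attribute equality checks.

```python
from dataclasses import dataclass, field

@dataclass
class Product:
    product_id: str
    name: str
    attributes: dict = field(default_factory=dict)

def apply_filters(products: list[Product], filters: dict) -> list[Product]:
    """Keep only the products whose attributes match every selected filter."""
    return [
        p for p in products
        if all(p.attributes.get(key) == value for key, value in filters.items())
    ]

def build_response(keyword_hits: list[Product],
                   semantic_hits: list[Product],
                   filters: dict) -> dict:
    """Filter both result sets the same way and keep them in separate sections."""
    keyword_section = apply_filters(keyword_hits, filters)
    already_shown = {p.product_id for p in keyword_section}
    # Semantic hits that the keyword section already shows are not repeated.
    semantic_section = [
        p for p in apply_filters(semantic_hits, filters)
        if p.product_id not in already_shown
    ]
    return {"keyword_results": keyword_section, "semantic_results": semantic_section}
```

Keeping the two sections apart lets the interface label semantic suggestions clearly, while the shared `apply_filters` step means a filter never silently ignores either set of items.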

Technical solutions for hybrid search: horizontal indexing

Bearing in mind the concerns involved in developing a hybrid search solution in Empathy Platform, the first step in combining semantic-based and keyword-based search is horizontal indexing. Scaling the number of jobs that are processed helps avoid bottlenecks, and raising the limit on the number of running jobs also solves dependency issues.

In addition, semantic models can be used to enrich indexing and searching features, while keyword search stays stable as new semantic models are created to combine both relevance models. As established by Empathy Platform's indexing pipeline, the semantic and keyword data sources can be merged at index time.
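
Merging the two data sources at index time could look roughly like the following. This is a hedged sketch rather than the actual indexing pipeline: `merge_at_index_time` and the `embedding` field name are hypothetical, and the inputs are assumed to be plain dictionaries keyed by product ID.

```python
def merge_at_index_time(keyword_docs: dict, embeddings: dict) -> dict:
    """Attach each product's embedding to its keyword document.

    Products without an embedding keep their keyword fields untouched,
    so keyword search keeps working while semantic models are rebuilt.
    """
    unified = {}
    for product_id, fields in keyword_docs.items():
        document = dict(fields)  # copy, so the keyword source is never mutated
        document["embedding"] = embeddings.get(product_id)  # None if not available yet
        unified[product_id] = document
    return unified

unified_index = merge_at_index_time(
    keyword_docs={"1": {"name": "Red sofa"}, "2": {"name": "Kettle"}},
    embeddings={"1": [0.9, 0.1]},
)
```

In this sketch, product `"2"` is still indexed with its keyword fields even though no vector exists for it yet, which is one simple way to keep keyword results stable while semantic data catches up.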


To see how Empathy Platform is approaching semantic search in real situations to enrich indexing and searching features in the journey to hybrid search, check out Experience AI-powered semantic search in Empathy Platform.

Horizontal indexing for ad hoc search capabilities

Horizontal indexing refers to the use of multiple pipelines in parallel for faster search response times, allowing for an integrated keyword and semantic search response. Horizontal index scaling also enables specific flows for other search elements and injectors, such as monetized products. These pipelines can enrich the product catalogue with more ranking signals, leading to higher search relevance. To leverage ad hoc horizontal indexing blended with semantic search, check out the monetization indexing use case.
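
The idea of several pipelines running in parallel and contributing signals to the same catalogue can be sketched with Python's standard `concurrent.futures` module. Everything here is an assumption made for illustration: the three pipeline functions are hypothetical placeholders (the semantic one merely stands in for a real embedding model), and `run_horizontal_indexing` is not an Empathy Platform function.

```python
from concurrent.futures import ThreadPoolExecutor

def keyword_pipeline(catalogue: dict) -> dict:
    """Hypothetical pipeline: tokenize product names for keyword matching."""
    return {pid: {"tokens": item["name"].lower().split()} for pid, item in catalogue.items()}

def semantic_pipeline(catalogue: dict) -> dict:
    """Hypothetical pipeline: placeholder vector standing in for a real model."""
    return {pid: {"embedding": [float(len(item["name"]))]} for pid, item in catalogue.items()}

def monetization_pipeline(catalogue: dict) -> dict:
    """Hypothetical pipeline: flag monetized products as a ranking signal."""
    return {pid: {"sponsored": item.get("sponsored", False)} for pid, item in catalogue.items()}

def run_horizontal_indexing(catalogue: dict, pipelines: list) -> dict:
    """Run every pipeline in parallel, then merge their signals per product."""
    with ThreadPoolExecutor(max_workers=len(pipelines)) as pool:
        futures = [pool.submit(pipeline, catalogue) for pipeline in pipelines]
        partial_results = [future.result() for future in futures]

    unified = {pid: {} for pid in catalogue}
    for partial in partial_results:
        for pid, signals in partial.items():
            unified[pid].update(signals)
    return unified

catalogue = {"1": {"name": "Red Sofa", "sponsored": True}, "2": {"name": "Kettle"}}
index = run_horizontal_indexing(
    catalogue, [keyword_pipeline, semantic_pipeline, monetization_pipeline]
)
```

Because each pipeline only reads the catalogue and returns its own signals, adding a new flow means appending one more function to the list, without touching the existing ones.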