Vector and keyword search as a unified index

Recent advances in semantic models have highlighted vector search as a way to deliver more relevant and faster experiences in commerce search. On the path to hybrid search, merging a vector search approach with a traditional keyword-based solution into a unified index is challenging, as conflicts can arise when both mechanisms coexist.

To understand how Empathy Platform approaches semantic model training on top of data privacy and consent integrity, see the Semantic models in vector search page.

Vector and keyword matches as a unified index

Keyword-based search relies on exact matches between the entered terms and the indexed product data, while vector search uses numerical representations to create associations between queries and the products in the catalogue. Some adjustments are therefore needed to ensure that both approaches evolve together and complement each other.
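The difference between the two matching styles can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the toy catalogue, the three-dimensional embeddings, and the helper names `keyword_match` and `vector_match` are hypothetical and are not part of Empathy Platform.

```python
import math

# Hypothetical toy catalogue: product id -> (title, embedding).
# The 3-dimensional embeddings are invented for illustration only.
CATALOGUE = {
    "p1": ("red running shoes", [0.9, 0.1, 0.0]),
    "p2": ("crimson trainers", [0.8, 0.2, 0.1]),
    "p3": ("blue denim jacket", [0.0, 0.1, 0.9]),
}


def keyword_match(query, catalogue):
    """Return ids of products whose title contains every query term exactly."""
    terms = query.lower().split()
    return [
        pid
        for pid, (title, _) in catalogue.items()
        if all(term in title.split() for term in terms)
    ]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def vector_match(query_vec, catalogue, top_k=2):
    """Return the ids of the top_k products closest to the query vector."""
    ranked = sorted(
        catalogue.items(),
        key=lambda item: cosine(query_vec, item[1][1]),
        reverse=True,
    )
    return [pid for pid, _ in ranked[:top_k]]
```

In this sketch, the query "red shoes" only reaches the product whose title contains both terms, while a query vector that sits near the "red footwear" region of the embedding space also reaches "crimson trainers", which shares no term with the query.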

Implementing vector and keyword search techniques side-by-side requires addressing the following issues:

  • Privacy should be the foundation for designing semantic relations, to prevent the system from becoming a black box solution.

  • A bridge needs to be built between how each technique handles search, so that search results are aligned and displayed efficiently.

  • Transparency should be key so that brands and merchants can understand the inner workings of search through Explainability.

  • Shoppers need to interact with the returned search results by applying filters, so actionable elements should rely on all indexed items, not only on semantic suggestions.

  • Vectorized search results should be displayed differently from keyword-based results until both can be effectively merged and coexist. Vector proximity calculations return fewer pages and products than keyword-based ones, which can disrupt user expectations.
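The last two points can be sketched in Python as follows. This is only one possible reading, under stated assumptions: the function names `apply_filter` and `build_sections`, the section labels, and the choice to drop semantic suggestions that already appear among the keyword results are all hypothetical and do not describe how Empathy Platform implements this.

```python
def apply_filter(catalogue, field, value):
    """Evaluate a filter over every indexed item, not only over returned results."""
    return {pid for pid, doc in catalogue.items() if doc.get(field) == value}


def build_sections(keyword_ids, vector_ids, allowed_ids=None):
    """Keep keyword results and semantic suggestions in separate sections.

    allowed_ids is the set of ids that pass the active filters,
    or None when no filter is applied.
    """
    def keep(pid):
        return allowed_ids is None or pid in allowed_ids

    keyword_section = [pid for pid in keyword_ids if keep(pid)]
    seen = set(keyword_section)
    # Assumption: a product already shown as a keyword result is not repeated.
    semantic_section = [
        pid for pid in vector_ids if keep(pid) and pid not in seen
    ]
    return {"results": keyword_section, "semantic_suggestions": semantic_section}
```

Keeping the two lists apart means the shorter vector result set never changes the pagination of the keyword results, while the same filter set applies to both sections.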

Technical solutions for hybrid search: horizontal indexing

Bearing in mind the concerns and issues of developing a hybrid search solution in Empathy Platform, the first step in combining vector-based and keyword-based search is horizontal indexing. Scaling the number of jobs that are processed helps avoid bottlenecks, and raising the limit on the number of running jobs also solves dependency issues.

Semantic models can also be used to enrich indexing and searching features, keeping keyword searches stable as new semantic models are created to combine both relevancy models. As established by Empathy Platform's indexing pipeline, the vector and keyword data sources can be merged at index time.
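Merging both data sources at index time can be pictured as a join on the product identifier. The Python sketch below is a simplified illustration under assumptions: the function `build_unified_index`, the `embedding` field name, and the input shapes are hypothetical and are not taken from Empathy Platform's indexing pipeline.

```python
def build_unified_index(keyword_docs, embeddings):
    """Merge keyword fields and vectors into one document per product id.

    keyword_docs: product id -> dict of keyword-searchable fields
    embeddings:   product id -> list of floats produced by a semantic model
    """
    unified = {}
    for product_id, fields in keyword_docs.items():
        doc = dict(fields)  # copy so the keyword source is left untouched
        # Products without a vector yet stay searchable by keyword only.
        doc["embedding"] = embeddings.get(product_id)
        unified[product_id] = doc
    return unified
```

Because every product keeps its keyword fields whether or not a vector exists, keyword search keeps working in this sketch while embeddings are added or regenerated.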

To see how Empathy Platform is approaching vector search in real situations to enrich indexing and searching features in the journey to hybrid search, check out Experience vector search in Empathy Platform.

Horizontal indexing for ad hoc search capabilities

Horizontal indexing refers to the use of multiple pipelines in parallel for faster search response times, allowing for an integrated keyword and vector search response. Horizontal index scaling also enables specific flows for other search elements and injectors, such as monetized products. These pipelines can enrich the product catalogue with more ranking signals, leading to higher search relevance. To leverage ad hoc horizontal indexing blended with vector search, check out the monetization indexing use case.
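As a rough illustration of parallel pipelines that each contribute their own signals to the same catalogue, consider the following Python sketch. It uses the standard library `concurrent.futures.ThreadPoolExecutor`. The three pipeline functions, the placeholder embedding, and the hard-coded sponsored product are assumptions made for the example, not Empathy Platform components.

```python
from concurrent.futures import ThreadPoolExecutor


def keyword_pipeline(products):
    """Hypothetical pipeline: tokenise titles for keyword matching."""
    return {pid: {"tokens": title.lower().split()} for pid, title in products.items()}


def vector_pipeline(products):
    """Hypothetical pipeline: a placeholder stands in for a real semantic model."""
    return {
        pid: {"embedding": [float(len(title)), float(len(title.split()))]}
        for pid, title in products.items()
    }


def monetization_pipeline(products):
    """Hypothetical pipeline: flag monetized products (hard-coded here)."""
    sponsored = {"p2"}
    return {pid: {"sponsored": pid in sponsored} for pid in products}


def run_horizontal_indexing(products, pipelines):
    """Run every pipeline in parallel and merge their signals per product."""
    enriched = {pid: {"title": title} for pid, title in products.items()}
    with ThreadPoolExecutor(max_workers=len(pipelines)) as pool:
        futures = [pool.submit(pipeline, products) for pipeline in pipelines]
        for future in futures:
            for pid, signals in future.result().items():
                enriched[pid].update(signals)
    return enriched
```

Each pipeline writes a disjoint set of fields in this sketch, so results can be merged in any order. A new flow, such as another injector, would be added by appending one more function to the `pipelines` list.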