Shopping queries dataset: A large-scale ESCI benchmark for improving A9’s product search

Danny McMillan
December 27, 2023

Enhancing Amazon’s A9 with Advanced Semantic Search Capabilities

Before you start, make sure you read the articles “Demystifying Semantic Matching: How It Powers Amazon and Beyond”, “The Evolution of Matching Products to Search Queries” and “Improving Seasonal Relevance in Amazon Product Search”, where I cover BERT, BERT SE, and semantic and lexical matching.

TL;DR: Key Takeaways from “Shopping Queries Dataset: A Large-Scale ESCI Benchmark for Improving A9’s Product Search”

1. Enhancement of Amazon’s A9 Algorithm with Semantic Search:

The Shopping Queries Dataset enables Amazon’s A9 algorithm to transcend traditional keyword-based search methods, advancing to more nuanced semantic search. This enhancement allows the algorithm to understand not just the words but the intentions and contexts behind user queries, significantly improving search accuracy and relevance.

2. Evolution of E-commerce Search Engines:

The dataset marks a pivotal advancement in the evolution of e-commerce search engines. It embodies the shift from simplistic, keyword-focused algorithms to complex, AI-driven systems that comprehend user behaviour and the subtleties of language, offering a comprehensive resource to further refine Amazon’s A9.

3. Multilingual and Global E-commerce Strategy:

The dataset’s multilingual nature is critical for Amazon’s global strategy. It ensures that the advancements in the A9 algorithm cater to a diverse customer base, addressing the challenges of adapting to multiple languages, cultural nuances, and local consumer behaviours. This approach enhances user experience and market penetration in non-English-speaking regions, providing Amazon with a competitive edge in global e-commerce.

4. The ESCI Framework for Detailed Product Understanding:

The dataset employs the ESCI (Exact, Substitute, Complement, Irrelevant) framework for annotating query-product pairs. This rigorous approach enriches the A9 algorithm’s understanding of the complex relationships between customer queries and products, allowing for more accurate and relevant search results, and improving product discovery through effective substitute and complement suggestions.

5. Necessity of Optimised Product Attributes for Amazon Brands:

In the context of these advancements, Amazon brands must ensure their product attributes are highly optimised. The A9 algorithm’s enhanced semantic capabilities necessitate clear, detailed, and contextually relevant product information. Optimised attributes improve visibility in competitive markets, aid in accurate product matching, enhance customer experience, and enable effective cross-selling and upselling opportunities. Brands must adapt to evolving search behaviours and leverage data-driven insights to maintain relevance and visibility in Amazon’s dynamic marketplace.

These takeaways underscore the significant impact of the Shopping Queries Dataset on Amazon’s A9 algorithm and highlight the evolving landscape of e-commerce search, where understanding user intent and providing relevant, personalised search results are paramount.

Grab a coffee and let’s get into it!

In the dynamic arena of e-commerce, the prowess of a search engine can make or break the customer experience. At the heart of this realm stands Amazon’s A9 algorithm, an emblem of innovation and efficiency. Recently, a significant stride was made in enriching this search technology: the release of the Shopping Queries Dataset on GitHub by Amazon Scholars and Applied Scientists, announced via Amazon Science (links at the end of this article). This dataset is not merely a collection of search queries and product information; it is a testament to Amazon’s commitment to revolutionising e-commerce search.

The Shopping Queries Dataset is poised to propel Amazon’s A9 algorithm to new heights of semantic understanding and precision. By transcending traditional keyword-based search mechanisms, this dataset offers a path to a more nuanced and intuitive search experience, aligning closely with the diverse and complex needs of Amazon’s global customer base. It’s a step towards a future where search algorithms comprehend not just words, but intentions, contexts, and the subtle nuances of human language.

The Evolution of E-commerce Search and Amazon’s A9

The journey of e-commerce search engines, particularly Amazon’s A9, has been marked by continuous evolution. From the early days of simplistic, keyword-focused algorithms to the sophisticated, AI-driven engines of today, each phase has brought forward new challenges and breakthroughs. The introduction of the Shopping Queries Dataset represents a pivotal point in this journey.

Initially, Amazon’s A9 algorithm, like many others, grappled with the limitations of keyword-centric searches. However, the advent of AI and machine learning technologies opened new doors. Algorithms began to understand context, user behaviour, and the subtleties of language, leading to more accurate and relevant search results. The Shopping Queries Dataset, with its emphasis on semantic search, is a culmination of these advancements, offering a comprehensive resource to further refine and enhance the capabilities of Amazon’s A9.

In-Depth Overview of the Shopping Queries Dataset

Developed by Amazon Science, the Shopping Queries Dataset emerges as a crucial tool for advancing A9. It encompasses a wide array of search queries in English, Japanese, and Spanish, each paired with up to 40 product results, annotated with detailed ESCI (Exact, Substitute, Complement, Irrelevant) relevance judgments.


The Multilingual Imperative in Amazon’s Global E-commerce Strategy

Amazon, as a global e-commerce leader, operates in a diverse array of international markets, each with its own language and cultural nuances. The multilingual aspect of the Shopping Queries Dataset is crucial in ensuring that advancements in Amazon’s search technology are not confined to English-speaking users but include a global audience.

Challenges in Multilingual Search Adaptation

Global Presence: Amazon has established marketplaces in numerous countries, including the United States, Canada, various European nations, Japan, India, and Australia, among others.

1. Language Complexity:

Each language has its unique syntax, semantics, idioms, and cultural context, making the adaptation of search algorithms complex. For instance, a direct translation of search terms might not capture the same intent in different languages.

2. Cultural Contextualisation:

Products may have different cultural relevance or usage patterns in different regions, affecting search behaviour. For instance, the term “football” would yield different relevant products in the U.S. (American football) versus Europe (soccer).

3. Local Consumer Behaviour:

Search patterns can vary significantly based on local consumer behaviour and preferences. Understanding and integrating these variations is crucial for effective search results.

4. Regional Variations and Dialects:

Even within the same language, regional variations and dialects can impact search terms and product relevance. For example, Spanish spoken in Spain can be quite different from that in Mexico or South America.

Advantages of a Multilingual Approach

1. Enhanced Global User Experience:

By accommodating multiple languages, Amazon can offer a more personalised and effective search experience to its global customer base.

2. Increased Market Penetration:

Effective multilingual search capabilities can enhance Amazon’s penetration in non-English-speaking markets, boosting global sales and brand loyalty.

3. Competitive Advantage:

Excelling in multilingual search capabilities gives Amazon a significant edge over competitors who may not have such advanced, localised search functions.

4. Data-Driven Global Insights:

The diverse data from multilingual searches can provide Amazon with valuable insights into global consumer trends and preferences, informing broader business strategies.

For Amazon, embracing the multilingual nature of e-commerce search is not just a technical challenge but a strategic necessity. The Shopping Queries Dataset, with its multilingual data, is a stepping stone towards creating a truly global search experience that resonates with customers across different languages and cultures. This endeavour involves not only the translation of languages but also the understanding of cultural nuances, regional differences, and local consumer behaviour. As Amazon continues to expand its global footprint, the ability to effectively manage and leverage these multilingual complexities will play a crucial role in maintaining its dominance in the global e-commerce landscape.

The ESCI Framework

At the core of the dataset is the innovative ESCI framework. It categorises products based on their relationship to the search query, going beyond mere keyword matches to include substitutes and complements. This approach aligns with the multifaceted nature of customer searches, where the intent is often more complex than finding a single, exact product.

Hypothetical Scenario: Product Substitute Identification

Keywords and Product Search

Suppose a customer is looking for “wireless noise-cancelling over-ear headphones” on Amazon. They enter these keywords into the search bar. The customer may have a specific brand or model in mind, such as the “Bose QuietComfort 35 II.”

Traditional Search vs. Enhanced A9 Algorithm

– Traditional Search: In a traditional search scenario, if the specific model (Bose QuietComfort 35 II) is out of stock or not available, the search engine might just show similar models from Bose or list unrelated products, leading to a suboptimal shopping experience.

– Enhanced A9 Algorithm with Shopping Queries Dataset: In this enhanced scenario, the A9 algorithm, trained with the Shopping Queries Dataset, recognises the need for substitutes. It understands that the customer is looking for “wireless noise-cancelling over-ear headphones” and not just any Bose product.

Intelligent Substitute Suggestions

– Criteria Recognition: Identifies key criteria from the customer’s query: wireless connectivity, noise cancellation, and over-ear design.

– Substitute Products: Instead of just listing other Bose models, the algorithm intelligently suggests alternatives that meet the identified criteria. For instance, it might recommend the “Sony WH-1000XM4” or the “Sennheiser PXC 550-II,” which are similar in functionality and quality to the Bose QuietComfort 35 II.

– Additional Information: Alongside these suggestions, the algorithm could provide a brief comparison or highlight key features, like battery life or compatibility, to assist the customer in making an informed decision.

Enhanced Shopping Experience

– Customer Satisfaction: The customer, initially looking for a specific model, is presented with relevant alternatives, increasing the likelihood of satisfaction and purchase.

– Brand Discovery: This approach also aids in discovering new brands or models that the customer might not have considered initially.

Scale and Scope

The sheer scale of the dataset is impressive. The reduced version contains 48,300 unique queries and more than 1.1 million judgments, while the larger version expands this to over 130,000 queries and 2.6 million judgments. This extensive range ensures that the A9 algorithm benefits from a diverse and comprehensive set of data, allowing for robust training and testing.

Methodology and Data Collection

The creation of the Shopping Queries Dataset was a meticulous process, aimed specifically at enhancing Amazon’s A9 algorithm.

Data Curation

The data was sourced from a variety of platforms, ensuring a wide range of product categories and consumer intents were represented. This diversity is crucial for an algorithm that powers the world’s largest online retailer.

Annotation Process

The annotation process was rigorous, with each query-product pair being evaluated by experts who assigned one of the four ESCI labels. This attention to detail ensures that the data not only feeds into the algorithm but also enriches its understanding of the complex relationship between queries and products.

In the context of the Shopping Queries Dataset, the annotation process involves assigning one of four ESCI labels to each query-product pair. These labels are crucial in categorising the nature of the relationship between the search query and the product. The four ESCI labels are:

1. Exact (E):

This label is used when the product is an exact match to the query. It signifies that the product directly and precisely corresponds to what the user is searching for. For instance, if a user searches for “iPhone 12 Pro Max 256GB,” an exact match would be that specific model and storage capacity of the iPhone.

2. Substitute (S):

A product is labelled as a substitute when it is not an exact match but can serve as a replacement for the query item. Substitute products typically fulfil the same needs or functions as the queried product but might differ in brand, model, specifications, or features. For example, if a user searches for “Nikon D3500 DSLR Camera,” a Canon EOS Rebel SL3 could be considered a substitute, offering similar functionality in a different brand.

3. Complement (C):

This label is used for products that complement or are typically used in conjunction with the query item. These are not direct replacements but are related in a way that they are often bought together or used alongside each other. For instance, if the search query is for a “PlayStation 5 console,” a complement could be a “PlayStation 5 controller” or “PlayStation 5 games.”

4. Irrelevant (I):

Products that have no meaningful relation or relevance to the search query are labelled as irrelevant. They neither match the query nor serve as substitutes or complements. For example, if the search is for “yoga mat,” a kitchen appliance like a “blender” would be deemed irrelevant.

The rigorous annotation process using these ESCI labels plays a critical role in the dataset. It ensures the data is rich and nuanced, enabling Amazon’s A9 algorithm to understand and interpret the complex relationships between customer queries and products more effectively. This detailed categorisation is key to providing accurate and relevant search results, enhancing the overall shopping experience on Amazon.
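To make the four labels concrete, here is a minimal sketch of how annotated query-product pairs could be represented in Python. The class and field names (EsciLabel, Judgment, label_counts) are my own for illustration and are not taken from the dataset, and the rows are the toy examples from the text above, not real dataset records.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class EsciLabel(Enum):
    """The four ESCI relevance labels (names here are illustrative)."""
    EXACT = "E"
    SUBSTITUTE = "S"
    COMPLEMENT = "C"
    IRRELEVANT = "I"


@dataclass(frozen=True)
class Judgment:
    """One annotated query-product pair (hypothetical structure)."""
    query: str
    product_title: str
    label: EsciLabel


# Toy judgments built from the examples in this article, not real dataset rows.
judgments = [
    Judgment("iPhone 12 Pro Max 256GB", "iPhone 12 Pro Max 256GB", EsciLabel.EXACT),
    Judgment("Nikon D3500 DSLR Camera", "Canon EOS Rebel SL3", EsciLabel.SUBSTITUTE),
    Judgment("PlayStation 5 console", "PlayStation 5 controller", EsciLabel.COMPLEMENT),
    Judgment("yoga mat", "blender", EsciLabel.IRRELEVANT),
]


def label_counts(rows):
    """Count how many judgments carry each ESCI label."""
    return Counter(row.label for row in rows)


for label, count in label_counts(judgments).items():
    print(label.name, count)
```

Counting labels like this is a quick sanity check on how balanced a sample of judgments is before using it for training or evaluation.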

Balancing and Comprehensiveness

Ensuring the dataset was balanced and comprehensive was a key focus. The team behind it worked to include a multitude of query types, reflecting the vast array of customer searches on Amazon. This diversity is vital for an algorithm tasked with catering to millions of unique customer needs daily.

Applications and Tasks

The Shopping Queries Dataset is a multi-faceted tool designed to refine and enhance A9’s search capabilities across three tasks.

Query-Product Ranking

One of the primary applications is improving the product ranking process. The dataset trains A9 to not just list products based on relevance but to understand the depth and breadth of customer intent, ensuring the most appropriate products top the search results.
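A common way to score a ranked list against graded labels is nDCG (normalised discounted cumulative gain). The sketch below shows the idea with ESCI labels. The gain values assigned to each label are an assumption for illustration; this article does not state which metric or gains the benchmark uses.

```python
import math

# Assumed gain per ESCI label (illustrative values, not confirmed by this article).
GAINS = {"E": 1.0, "S": 0.1, "C": 0.01, "I": 0.0}


def dcg(labels):
    """Discounted cumulative gain of a ranked list of ESCI labels."""
    return sum(GAINS[label] / math.log2(rank + 2) for rank, label in enumerate(labels))


def ndcg(labels):
    """DCG divided by the DCG of the ideal ordering of the same labels."""
    ideal = sorted(labels, key=lambda label: GAINS[label], reverse=True)
    ideal_dcg = dcg(ideal)
    return dcg(labels) / ideal_dcg if ideal_dcg > 0 else 0.0


good_ranking = ["E", "S", "C", "I"]  # exact match first
poor_ranking = ["I", "C", "S", "E"]  # exact match buried at the bottom

print(ndcg(good_ranking))
print(ndcg(poor_ranking))
```

The first ranking scores higher because the exact match sits at the top, where the position discount is smallest.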

Multi-class Product Classification

Another critical application is the classification of products into the ESCI categories. This task enables the A9 algorithm to discern the nature of products in relation to the query, an essential feature for a nuanced and effective search experience.
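One way to evaluate a four-class classifier of this kind is micro-averaged F1, sketched below in plain Python. Whether the benchmark scores this task with micro-F1 is an assumption here, and the labels and predictions are made up for illustration.

```python
LABELS = ("E", "S", "C", "I")


def micro_f1(true_labels, predicted_labels):
    """Micro-averaged F1 over the four ESCI classes."""
    tp = fp = fn = 0
    for cls in LABELS:
        for truth, pred in zip(true_labels, predicted_labels):
            if pred == cls and truth == cls:
                tp += 1  # correctly predicted this class
            elif pred == cls and truth != cls:
                fp += 1  # predicted this class, but it was another
            elif pred != cls and truth == cls:
                fn += 1  # missed an item of this class
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical gold labels and model predictions for six query-product pairs.
truth = ["E", "E", "S", "C", "I", "I"]
preds = ["E", "S", "S", "C", "I", "E"]

print(micro_f1(truth, preds))
```

When every pair has exactly one label, micro-averaged F1 equals plain accuracy: here four of the six predictions are correct.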

Product Substitute Identification

Identifying substitutes is a complex but crucial aspect of e-commerce search. The dataset equips the A9 algorithm to intelligently suggest alternatives, enhancing the shopping experience, especially in cases where the exact product is not available.
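Substitute identification can be framed as a binary version of the four-way labelling: a pair is either a substitute or it is not. Here is a minimal sketch under that assumption. The function names and the toy rows (including the carrying case) are hypothetical and reuse the headphones scenario from earlier in this article.

```python
def is_substitute(esci_label):
    """Collapse a four-way ESCI label into a binary substitute / non-substitute target."""
    return esci_label == "S"


def substitutes_for(query, judgments):
    """Return product titles labelled Substitute for the given query."""
    return [title for q, title, label in judgments if q == query and is_substitute(label)]


# Toy (query, product title, ESCI label) rows, not real dataset records.
toy_judgments = [
    ("Bose QuietComfort 35 II", "Bose QuietComfort 35 II", "E"),
    ("Bose QuietComfort 35 II", "Sony WH-1000XM4", "S"),
    ("Bose QuietComfort 35 II", "Sennheiser PXC 550-II", "S"),
    ("Bose QuietComfort 35 II", "Headphone carrying case", "C"),
    ("yoga mat", "blender", "I"),
]

print(substitutes_for("Bose QuietComfort 35 II", toy_judgments))
```

A model trained on the binary target learns only the substitute distinction, which is the part that matters when the exact product is unavailable.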

Baseline Models and Benchmarking

Accompanying the Shopping Queries Dataset are baseline models that serve as a benchmark for the performance of Amazon’s A9.

BERT and Its Variants

The baseline models are primarily based on BERT, a groundbreaking language processing model. By adapting these models to the specific tasks of A9, Amazon can ensure its search engine remains at the cutting edge of semantic search technology.

Setting a Benchmark

These models set a standard against which Amazon can continually measure and improve the A9 algorithm. This benchmarking is crucial in a landscape where customer expectations and technological capabilities are constantly evolving.

Specific Implications for Amazon’s A9 Algorithm

The Shopping Queries Dataset has several direct implications for Amazon’s A9 algorithm.

Enhanced User Intent Understanding

One of the most significant impacts is on the algorithm’s ability to understand user intent. The dataset’s focus on semantic search means A9 can interpret queries not just at face value but in the context of what the customer is truly seeking.

Improved Search Relevance and Diversity

The dataset also enhances the relevance and diversity of search results. By training on a wide range of queries and product relationships, the A9 algorithm can offer more accurate and varied search results, catering to a broader spectrum of customer needs.

Multilingual Search Optimisation

The multilingual nature of the dataset means the A9 algorithm can offer improved search experiences across different languages, a key factor for Amazon’s global audience.

Continuous Improvement and Benchmarking

The dataset provides a means for ongoing assessment and improvement of the A9 algorithm. As the dataset grows and evolves, so too can Amazon’s search technology, ensuring it remains a leader in e-commerce search.

The advancements in Amazon’s A9 algorithm, particularly with the integration of the Shopping Queries Dataset and its emphasis on semantic search, highlight the increasing importance for Amazon brands to ensure their product attributes are highly optimised.

Let’s delve into why this is crucial and how it impacts both visibility and sales.

In-Depth Analysis of the Need for Optimised Product Attributes

1. Enhanced Semantic Understanding of A9 Algorithm

– Contextual Relevance: The A9 algorithm’s improved semantic capabilities mean that it doesn’t just rely on keywords. It also assesses the context and relevance of product attributes to the search query.

– Accurate Matching: Products with well-defined, clear, and relevant attributes are more likely to be accurately matched to nuanced customer queries.

2. Increased Competition on the Platform

– Visibility in Saturated Markets: As Amazon’s marketplace grows increasingly competitive, products with optimised attributes stand out more in search results.

– Differentiation Factor: Detailed and precise attributes can be a key differentiator, helping products rise above competitors in search rankings.

3. Importance of Complete and Detailed Product Listings

– Comprehensive Information: The A9 algorithm favours listings that provide complete and detailed information. This includes not just the title and description but also specifications, features, and high-quality images.

– Enhancing Customer Experience: Detailed listings aid customers in making informed decisions, which is a priority for the A9 algorithm.

4. Role in Substitute and Complementary Product Identification

– Facilitating Accurate Recommendations: Optimised attributes help the A9 algorithm identify and suggest relevant substitutes or complementary products, enhancing cross-selling and upselling opportunities.

– Building a Product Ecosystem: For brands with multiple products, well-optimised attributes can lead to better internal linking within the Amazon ecosystem.

5. Adapting to Evolving Search Behaviours

– Voice Search and Mobile Shopping: With the rise of voice-activated devices and mobile shopping, concise and relevant product attributes are essential for being surfaced in these searches.

– Meeting Diverse Customer Needs: As customer search behaviours evolve, listings need to be optimised to cater to a broader range of queries.

Conclusion: Necessity of Optimised Attributes

In the era of advanced AI-driven search algorithms like Amazon’s A9, the optimisation of product attributes is more than a best practice; it’s a necessity. It’s crucial for brands to invest time and resources in meticulously crafting their product listings, ensuring that every aspect, from titles to detailed descriptions and specifications, aligns with how the A9 algorithm interprets and ranks products. This approach not only enhances visibility in a crowded marketplace but also directly contributes to improved customer experiences and, ultimately, to higher sales and brand loyalty.

Sources:
https://github.com/amazon-science/esci-data
https://www.amazon.science/code-and-datasets/shopping-queries-dataset-a-large-scale-esci-benchmark-for-improving-product-search