Authors: Danny McMillan & Oana Padurariu
Over the last few months, sellers, service providers, and key players in the Amazon community have been raving about COSMO. The community has finally woken up, and more and more people are now citing the scientific literature.
In this article we will dive into the research by Changlong Yu and Zheng Li, which outlines the development of the COSMO framework. This framework utilizes large language models (LLMs) to build commonsense knowledge graphs, significantly boosting recommendation performance.
TLDR
- Commonsense Reasoning in Recommendations: The COSMO framework improves recommendation accuracy by incorporating commonsense relationships.
- Data Sources: The framework uses query-purchase and co-purchase data pairs from customer interactions.
- LLM Integration: Large language models generate and refine hypotheses about product relationships.
- Significant Performance Boost: COSMO-based models achieved up to a 60% increase in macro F1 scores compared to baseline models. (The F1 score measures how well a model correctly identifies and categorizes items. Classes are the different groups that data can be sorted into, and macro F1 averages the score across those classes.)
- Human and Machine Collaboration: A combination of machine learning models and human annotations filters and enhances the quality of the knowledge graph.
Introduction
As the market leader, Amazon aims to provide the best shopping experience, and product recommendation systems play a pivotal role in that. A customer looking for “shoes for pregnant women” expects the recommendation system to suggest slip-resistant shoes, understanding the underlying commonsense reasoning. This is where the COSMO framework steps in, leveraging large language models to build a commonsense knowledge graph that encodes relationships between products and their real-world contexts.
Key Findings
1. Commonsense Reasoning in Recommendations
COSMO uses commonsense reasoning to improve product recommendations. By understanding implicit relationships, such as linking slip-resistant shoes to pregnant women, the recommendation system becomes more intuitive and effective. This approach helps in anticipating customer needs based on everyday knowledge, which is not explicitly stated in the data.
2. Data Sources for Knowledge Graphs
The COSMO framework relies on two primary data sources:
- Query-Purchase Pairs: These combine customer queries with purchases made shortly after.
- Co-Purchase Pairs: These involve products bought together during the same shopping session.
Initial data pruning removes noise, ensuring the remaining data accurately represents meaningful relationships.
To reduce irrelevant query-product pairs and random co-buy pairs, Amazon implemented additional measures after noticing a significantly low typicality ratio for co-buy data pairs. This low ratio occurs because large language models (LLMs) tend to generate intention knowledge for just one of the co-purchased products, rather than considering the common reasons behind the joint purchase. This tendency often leads to implausible generations, so a more refined method is needed to improve the relevance and accuracy of co-purchase predictions.
To address this issue and build a more robust model, Amazon collected data covering 18 domains (product categories), 15 relations (such as used_for_function, used_for_event, and is_a), and 5 tasks.
By utilizing domain and relation classifications, Amazon can more effectively identify irrelevant query-product pairs or random co-buy pairs. This categorization allows for more nuanced filtering, retaining valuable data while eliminating noise that could skew recommendations or search results. Covering 18 product domains, 15 relation types and 5 tasks, Amazon creates a rich, diverse dataset for instruction tuning of the COSMO Language Model. This diversity allows the model to learn nuanced differences between product categories and relationship types, leading to more accurate and contextually appropriate outputs.
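To make the two data sources more tangible, here is a minimal Python sketch of how query-purchase and co-purchase pairs could be derived from a session log. The log format, the field names, and the 600-second window for "shortly after" are all assumptions made for illustration. The source does not describe Amazon's actual pipeline at this level of detail.

```python
from collections import Counter
from itertools import combinations

# Hypothetical event log: (session_id, timestamp_seconds, event_type, value).
# The format and the values are invented for illustration.
EVENTS = [
    ("s1", 0, "query", "shoes for pregnant women"),
    ("s1", 120, "purchase", "slip-resistant shoes"),
    ("s1", 300, "purchase", "compression socks"),
    ("s2", 0, "query", "camping"),
    ("s2", 4000, "purchase", "air mattress"),
]


def build_pairs(events, window=600):
    """Return (query_purchase, co_purchase) pair counts as Counters."""
    query_purchase = Counter()
    co_purchase = Counter()

    # Group events by session, ordered by timestamp within each session.
    sessions = {}
    for session_id, ts, kind, value in sorted(events):
        sessions.setdefault(session_id, []).append((ts, kind, value))

    for session_events in sessions.values():
        last_query = None  # (timestamp, text) of the most recent query
        purchases = []
        for ts, kind, value in session_events:
            if kind == "query":
                last_query = (ts, value)
            elif kind == "purchase":
                purchases.append(value)
                # Keep the pair only if the purchase came soon after the query.
                if last_query and ts - last_query[0] <= window:
                    query_purchase[(last_query[1], value)] += 1
        # Every unordered pair of distinct products bought in the same session.
        for a, b in combinations(sorted(set(purchases)), 2):
            co_purchase[(a, b)] += 1

    return query_purchase, co_purchase


qp, cp = build_pairs(EVENTS)
print(qp)
print(cp)
```

In this toy log, the "camping" purchase happens long after the query, so it is dropped as a query-purchase pair. That is a very crude stand-in for the pruning step described above.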
3. Integration of Large Language Models
Large language models (LLMs) are central to the COSMO framework. They generate hypotheses about product relationships from the data pairs. These hypotheses undergo a rigorous filtering process using human annotations and machine learning models to ensure high quality. The refined hypotheses are then used to create detailed relationship descriptors, enhancing the knowledge graph.
4. Significant Performance Boost
The performance of COSMO-enhanced models was evaluated using the Shopping Queries Data Set from the KDD Cup 2022. The results were remarkable:
- The COSMO-based model achieved a 60% increase in macro F1 score compared to the best baseline model.
- Even after fine-tuning on a subset of the data, the COSMO model maintained a substantial performance edge, with a 28% improvement in macro F1 and a 22% improvement in micro F1 scores.
These improvements highlight the effectiveness of incorporating commonsense knowledge into recommendation engines.
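If the difference between macro F1 and micro F1 is unclear, the following plain-Python sketch may help. Macro F1 averages the per-class F1 scores, so every class counts equally, while micro F1 pools all predictions before computing the score. The labels and predictions below are made up for illustration and are not taken from the KDD Cup data.

```python
def f1_scores(y_true, y_pred):
    """Return (macro_f1, micro_f1) for two equal-length label lists."""
    pairs = list(zip(y_true, y_pred))
    labels = sorted(set(y_true) | set(y_pred))
    per_class = []
    tp_all = fp_all = fn_all = 0
    for label in labels:
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 for this class; defined as 0 when the class never appears.
        per_class.append(2 * tp / denom if denom else 0.0)
        tp_all, fp_all, fn_all = tp_all + tp, fp_all + fp, fn_all + fn
    macro = sum(per_class) / len(per_class)
    micro_denom = 2 * tp_all + fp_all + fn_all
    micro = 2 * tp_all / micro_denom if micro_denom else 0.0
    return macro, micro


# Toy data: class "B" is rare and the model never predicts it.
y_true = ["A"] * 8 + ["B"] * 2
y_pred = ["A"] * 10
macro_f1, micro_f1 = f1_scores(y_true, y_pred)
print(f"macro F1 = {macro_f1:.3f}, micro F1 = {micro_f1:.3f}")
```

Here micro F1 looks healthy at 0.8 while macro F1 drops to roughly 0.44, because the model fails completely on the rare class. That is why a gain in macro F1 suggests the model is improving across classes rather than only on the most common one.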
5. Human and Machine Collaboration
COSMO’s success lies in the collaboration between machine learning models and human expertise. Human annotators assess the plausibility and typicality of LLM-generated hypotheses. This combined approach ensures the knowledge graph reflects realistic and commonly understood relationships, enhancing the recommendation system’s accuracy.
6. Search Navigation
COSMO goes beyond traditional e-commerce by transforming how shoppers search and find products. Instead of using rigid product categories, it focuses on customer needs and intentions, making shopping more intuitive by matching how customers think and speak about what they want. The system’s key feature, Multi-Turn Navigation, allows for a natural, conversational search process. For instance, a search for “camping” might lead to “air mattress,” then “camping air mattress,” and finally to specific options for lakeside, mountain, or group camping. This approach not only personalizes the shopping experience but also enables deeper, more precise product discovery, significantly improving how users find what they’re looking for.
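As a rough illustration of the multi-turn idea, the camping example can be modeled as a small map of refinements that a shopper walks through one turn at a time. This is only a toy sketch. The `REFINEMENTS` map and the `navigate` function are hypothetical names, and the real system is certainly not a hard-coded dictionary.

```python
# Hypothetical refinement map mirroring the camping example above.
REFINEMENTS = {
    "camping": ["air mattress"],
    "air mattress": ["camping air mattress"],
    "camping air mattress": [
        "lakeside camping",
        "mountain camping",
        "group camping",
    ],
}


def navigate(start, choices):
    """Follow one refinement per turn and return the full search path."""
    path = [start]
    for choice in choices:
        options = REFINEMENTS.get(path[-1], [])
        if choice not in options:
            raise ValueError(f"{choice!r} is not offered after {path[-1]!r}")
        path.append(choice)
    return path


path = navigate("camping", ["air mattress", "camping air mattress", "group camping"])
print(" -> ".join(path))
```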
The COSMO Framework
The COSMO framework involves several steps to build and refine the commonsense knowledge graph:
- Hypothesis Generation: LLMs generate hypotheses about relationships between products based on query-purchase and co-purchase data.
- Filtering Low-Quality Hypotheses: A combination of human annotations and machine learning models filters out low-quality hypotheses.
- Refinement and Annotation: Human reviewers extract guiding principles from high-quality hypotheses, which are then used to prompt the LLM.
- Training Classifiers: Machine-learning-based classifiers assign scores to the remaining hypotheses, retaining only the most plausible and typical relationships.
- Knowledge Graph Construction: The final set of relationships is encoded into a knowledge graph, which is used to enhance the recommendation system.
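The steps above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not Amazon's implementation. The hypotheses, the plausibility and typicality scores, and the 0.5 thresholds are all invented, and in the real framework the scores would come from trained classifiers rather than being typed in by hand.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Hypothesis:
    head: str            # product or query the hypothesis is about
    relation: str        # relation type, e.g. "used_for_function"
    tail: str            # generated intention or concept
    plausibility: float  # made-up score in [0, 1]
    typicality: float    # made-up score in [0, 1]


# Invented candidates standing in for LLM-generated hypotheses.
CANDIDATES = [
    Hypothesis("slip-resistant shoes", "used_for_function",
               "preventing slips during pregnancy", 0.90, 0.80),
    Hypothesis("air mattress", "used_for_event", "camping", 0.95, 0.90),
    Hypothesis("air mattress", "used_for_function", "floating on a lake", 0.70, 0.20),
    Hypothesis("air mattress", "is_a", "kitchen appliance", 0.10, 0.05),
]


def build_graph(hypotheses, min_plausibility=0.5, min_typicality=0.5):
    """Keep hypotheses that pass both thresholds and store them as edges."""
    graph = defaultdict(list)
    for h in hypotheses:
        if h.plausibility >= min_plausibility and h.typicality >= min_typicality:
            graph[h.head].append((h.relation, h.tail))
    return dict(graph)


graph = build_graph(CANDIDATES)
for head, edges in graph.items():
    for relation, tail in edges:
        print(f"{head} --{relation}--> {tail}")
```

Note that "floating on a lake" is plausible but not typical, so it is filtered out along with the clearly wrong "kitchen appliance" hypothesis. Requiring both scores is what keeps the graph focused on commonly understood relationships.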
Example: How You Can Use COSMO to Optimize Your Listing
When selling a product on Amazon, it’s crucial to optimize your product listings with contextual words that resonate with your target audience. This guide will help you, as a brand owner and marketer, use the COSMO framework to create effective and appealing product listings. Yes, keywords and indexing still matter, but search has moved to a whole new realm as algorithms have become more sophisticated, so you have to refine and adapt with the technology. In this example, we will use beard oil.
Step 1: Understand Your Product and Audience
Identify Key Features and Benefits:
- 100% organic ingredients
- Moisturizes and softens beard hair
- Promotes healthy beard growth
- Suitable for all skin types
- Natural scent
Understand Your Audience:
- Men with beards who prefer natural and organic products
- Individuals looking for high-quality grooming products
- Those interested in maintaining beard health and appearance
- Women searching for a sustainable, organic, or natural product to give as a gift to a man
Actionable Tip: Write down a list of your product’s key features and benefits, and think about who your ideal customer is. This will help you create relevant and appealing content that will speak directly to their needs.
Step 2: Gather Data Sources
Query-Purchase Pairs:
- Example Query: “best organic beard oil”
- Example Purchase: Your organic beard oil
Co-Purchase Pairs:
- Products often bought together: Beard combs, beard shampoo, beard balm
Actionable Tip: Analyze your customer data to identify common search queries and products that are frequently bought together with your beard oil. This will help you understand what customers are looking for and how they use your product. Extract data from Brand Analytics to understand your customers’ purchasing behavior. Analyze insights from the Market Basket Analysis to identify products frequently bought together and use these findings to uncover effective bundling strategies and cross-sell opportunities based on customer buying patterns.
Log in to your Seller Central account -> Access Brand from the main menu -> Brand Analytics -> Consumer Behaviour Analytics -> Market Basket Analysis
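Once you have exported the Market Basket Analysis report, a short script can rank the products most often bought with yours. The sketch below uses an inline stand-in for the export. The column names (`product`, `purchased_with`, `combination_pct`) and all figures are invented for illustration, so check the headers in your own file before adapting it.

```python
import csv
import io

# Stand-in for an exported report; columns and numbers are hypothetical.
REPORT = """product,purchased_with,combination_pct
Organic Beard Oil,Beard Comb,12.5
Organic Beard Oil,Beard Shampoo,9.0
Organic Beard Oil,Phone Charger,0.4
Organic Beard Oil,Beard Balm,7.2
"""


def top_co_purchases(report_text, product, min_pct=1.0):
    """Return (item, pct) tuples for one product, highest share first."""
    rows = csv.DictReader(io.StringIO(report_text))
    matches = [
        (row["purchased_with"], float(row["combination_pct"]))
        for row in rows
        if row["product"] == product and float(row["combination_pct"]) >= min_pct
    ]
    return sorted(matches, key=lambda item: item[1], reverse=True)


bundle_candidates = top_co_purchases(REPORT, "Organic Beard Oil")
for item, pct in bundle_candidates:
    print(f"{item}: {pct}%")
```

The `min_pct` cutoff is a simple way to drop random co-buys, such as the phone charger here, before you start thinking about bundles.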
Step 3: Use ChatGPT to Prompt Ideas
Use prompts to generate potential relationships and contexts for your product based on the data you’ve gathered.
Example Hypotheses:
- Customers looking for “best organic beard oil” might value high-quality, natural ingredients.
- Beard oil buyers often also purchase grooming tools, indicating an interest in comprehensive beard care.
- Common attributes: “hydrating”, “softening”, “nourishing”, “natural scent”
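If you want to build your prompts consistently, you could assemble them from the data you gathered in Step 2. The helper below is a hypothetical sketch. The function name and the prompt wording are our own suggestion, and it only builds the text. You would paste the result into ChatGPT or another LLM tool yourself.

```python
def build_prompt(product, queries, co_purchases):
    """Assemble a hypothesis-generation prompt from gathered data."""
    lines = [
        f"Product: {product}",
        "Customer search queries that led to a purchase:",
        *[f"- {query}" for query in queries],
        "Products frequently bought together with it:",
        *[f"- {item}" for item in co_purchases],
        "For each query and each co-purchased product, suggest in one "
        "sentence why a customer might want this product. Then list the "
        "attribute words that describe those needs.",
    ]
    return "\n".join(lines)


prompt = build_prompt(
    "organic beard oil",
    ["best organic beard oil"],
    ["beard comb", "beard shampoo", "beard balm"],
)
print(prompt)
```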
Step 4: Refine Contextual Words
Filter and refine the hypotheses to identify the most relevant and effective contextual words.
Filtered Contextual Words:
- Hydrating
- Softening
- Nourishing
- Natural scent
- Promotes healthy beard growth
- Suitable for sensitive skin
- Ideal for daily use
Actionable Tip: Choose the words that best describe your product and match the needs and preferences of your target audience. These words should be incorporated into your product listings.
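A quick way to see which of these words your listing already covers is a simple text check. The sketch below uses the filtered words from this step and a made-up listing. The `LISTING` content and the `coverage` function are illustrative assumptions, and plain substring matching is deliberately naive, so it will miss synonyms and rephrasings.

```python
CONTEXTUAL_WORDS = [
    "hydrating",
    "softening",
    "nourishing",
    "natural scent",
    "promotes healthy beard growth",
    "suitable for sensitive skin",
    "ideal for daily use",
]

# Hypothetical listing copy, invented for this example.
LISTING = {
    "title": "Organic Beard Oil, Hydrating and Softening, Natural Scent",
    "bullets": [
        "Promotes healthy beard growth with 100% organic ingredients",
        "Suitable for all skin types",
    ],
    "description": "A nourishing oil that moisturizes and softens beard hair.",
}


def coverage(listing, phrases):
    """Map each phrase to True if it appears anywhere in the listing text."""
    text = " ".join(
        [listing["title"], *listing["bullets"], listing["description"]]
    ).lower()
    return {phrase: phrase.lower() in text for phrase in phrases}


missing = [phrase for phrase, found in coverage(LISTING, CONTEXTUAL_WORDS).items() if not found]
print("Missing from listing:", missing)
```

In this example the listing never mentions sensitive skin or daily use, so those are the two phrases to consider adding, provided they are accurate for your product.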
Action Plan for Listing Optimization
- Brainstorm Contextual Words:
Write down all features, benefits, ingredients and uses of your beard oil.
Think about the needs and preferences of your target audience.
- Analyze Customer Data:
Look at common search queries and co-purchases.
Identify trends and patterns in customer behavior.
Actionable Tip: To get a clear picture of your customers and their behavior, you’ll want to dive into the data from Brand Analytics and mix it with insights from other tools you might be using. Take a close look at how people are searching for your products. This can reveal a lot about market trends and how shoppers’ preferences are changing. It’s also helpful to see how you stack up against the market averages.
Monitor the data on a weekly basis and compare yourself to market averages on the most relevant terminology.
Filter all relevant terminology and check for popularity shifts as well as emerging trends.
Log in to your Seller Central account -> Access Brand from the main menu -> Brand Analytics -> Search Analytics -> Search Query Performance
Explore the demographics section too as it provides critical insights into your actual customer base. By synthesizing this information with other analytics, you’ll develop a comprehensive understanding of your market and the profile of your customers – age, gender, marital status, etc.
Log in to your Seller Central account -> Access Brand from the main menu -> Brand Analytics -> Consumer Behaviour Analytics -> Demographics
- Generate and Refine Hypotheses:
Use an LLM tool to create hypotheses about product relationships.
Filter and refine these to select the most relevant contextual words.
- Integrate Contextual Words:
Update your product title, description, bullet points and backend product attributes with the refined words. Add the most relevant words to your secondary images, A+ / Premium A+ content and alt text to help the algorithm better understand and index your product for all relevant searches.
Make sure all product attributes are accurately added in the backend. This allows the search function to efficiently identify and match customer queries to your products.
The Critical Impact of Product Attributes on Success
Product attributes play a vital role in the success of your Amazon listings by enhancing search visibility and discoverability. Amazon’s algorithm relies on these attributes to accurately index and categorize products, ensuring they appear in relevant search results and category pages. Since attributes vary by category and browsing node, it’s crucial to review all available columns and add the necessary information. This not only boosts your product’s visibility but also helps it reach the right audience, ultimately driving better sales performance.
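If you keep your listing data in a spreadsheet or script, a simple check can flag attributes you have left blank. The sketch below is purely illustrative. The `REQUIRED` list and the attribute names are hypothetical, and the real requirements come from Amazon's template for your category and browsing node.

```python
# Hypothetical required attributes per category, invented for this example.
REQUIRED = {
    "beard oil": ["brand", "scent", "item_form", "skin_type", "unit_count"],
}


def missing_attributes(category, attributes):
    """Return required attribute names that are absent or blank."""
    return [
        name
        for name in REQUIRED[category]
        if not str(attributes.get(name, "")).strip()
    ]


listing_attributes = {
    "brand": "Example Brand",
    "scent": "Natural",
    "item_form": "Oil",
    "skin_type": "  ",  # filled with whitespace only, so treated as blank
}

gaps = missing_attributes("beard oil", listing_attributes)
print("Attributes to fill in:", gaps)
```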
Free Bot
To make this process easier, you can use the chat bot below to check for all required attributes. Just provide the link to the product detail page (PDP) and the subcategory, and you’ll get a list of all the required attributes for your product specifically.
Link to chat bot:
https://chatgpt.com/g/g-pstSvyAfG-product-back-end-attributes-required
Video showing how it works:
Amazon.com : Native Deodorant Contains Naturally Derived Ingredients, 72 Hour Odor Control | Deodorant for Women and Men, Aluminum Free with Baking Soda, Coconut Oil and Shea Butter | Coconut & Vanilla : Beauty & Personal Care – 8 August 2024
To fill in the back-end product attributes, you have two options:
1. Individual Listing Method: Go to each listing -> Click “Edit listing” -> Open the page -> Navigate to the “Product Details” section
2. Bulk Update Method: Download a product category report for the specific category and subcategory -> Follow the example for the required category using the bulk file provided by Amazon
Example of product attributes required in the Grocery & Gourmet Food, Candy & Chocolate browsing node.
Ensure your listing is clear, engaging, and informative, but most importantly, contains only relevant information.
- Monitor and Adjust:
Track the performance of your updated listings.
Make adjustments based on customer feedback and sales data.
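For the weekly monitoring described in this action plan, a small script can flag which search terms are gaining or losing popularity. The sketch below is an assumption-laden example. The volumes are invented, the 20% threshold is arbitrary, and you would replace `WEEKLY_VOLUME` with figures from your own Search Query Performance exports.

```python
# Hypothetical search volumes: term -> [previous week, current week].
WEEKLY_VOLUME = {
    "organic beard oil": [1000, 1100],
    "beard oil for sensitive skin": [200, 320],
    "beard growth oil": [800, 600],
}


def weekly_shifts(volumes, threshold=0.2):
    """Label each term as rising, falling, or stable week over week."""
    shifts = {}
    for term, (previous, current) in volumes.items():
        change = (current - previous) / previous
        if change >= threshold:
            label = "rising"
        elif change <= -threshold:
            label = "falling"
        else:
            label = "stable"
        shifts[term] = (round(change, 2), label)
    return shifts


shifts = weekly_shifts(WEEKLY_VOLUME)
for term, (change, label) in shifts.items():
    print(f"{term}: {change:+.0%} ({label})")
```

A rising term that your listing does not yet cover is a candidate for your next round of contextual words, while a falling term may deserve less space in your copy.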
Conclusion
By using the COSMO framework to integrate contextual words into your product listings, you can significantly enhance their effectiveness. This approach helps ensure that your listings resonate with potential customers, leading to improved visibility and sales. Follow this guide to optimize your organic beard oil listings and see the difference it can make in your e-commerce success.