Predicting Answers to Product Questions Using Similar Products

Danny McMillan
November 1, 2023

The Problem

Many online shopping sites allow customers to ask questions about products, which other customers then answer based on their experiences. This helps shoppers make informed decisions. However, new or uncommon products often have few or no answered questions.

The Proposed Solution

To address this problem, researchers propose an automated way to predict answers for these unresolved questions.

The main idea is to find similar questions that have already been answered for similar products, then aggregate those answers to make a prediction.

How it Works

The proposed method has four main steps:

1. Retrieve Potentially Relevant Questions

Given a new question about a product, the system first searches through a large database of previously asked and answered questions about other products.

It uses a fast AI technique to retrieve hundreds of candidate questions that might be semantically similar to the new question.
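The retrieval step can be illustrated with a minimal sketch. The article does not name the retrieval technique, so the bag-of-words `embed` function below is only a toy stand-in for whatever encoder the real system uses, and the names `embed`, `cosine`, and `retrieve` are hypothetical. A production system would likely use a learned sentence encoder with an approximate nearest-neighbor index instead of sorting the whole corpus.

```python
import math
from collections import Counter

def embed(text):
    """Toy stand-in for a sentence encoder: a bag-of-words count vector."""
    return Counter(text.lower().replace("?", "").split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(new_question, corpus, k=3):
    """Return the k corpus questions most similar to new_question."""
    q = embed(new_question)
    ranked = sorted(corpus, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]

# Invented example questions, for illustration only.
corpus = [
    "Is this jacket waterproof?",
    "Does the jacket have a hood?",
    "Is this blender dishwasher safe?",
    "Is the coat waterproof in heavy rain?",
]
candidates = retrieve("Is this coat waterproof?", corpus, k=2)
print(candidates)
```

With this toy corpus, the two waterproofing questions rank highest, which is the kind of broad candidate set the next step then narrows down.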

2. Filter for Highly Similar Questions

Next, each retrieved question is scored for similarity against the new question using a more sophisticated but slower neural network model.

Only highly similar questions exceeding a threshold are kept. These are considered “twin” questions likely expressing the same intent as the new question.
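The filtering step can be sketched as a score-and-threshold pass. The scorer here is a simple token-overlap (Jaccard) placeholder rather than the neural model the paper uses, and the threshold of 0.5 is an assumed value chosen only for this example; `similarity_score` and `filter_twins` are hypothetical names.

```python
def similarity_score(q1, q2):
    """Placeholder for the slower neural scorer: Jaccard overlap of tokens."""
    a = set(q1.lower().replace("?", "").split())
    b = set(q2.lower().replace("?", "").split())
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def filter_twins(new_question, candidates, threshold=0.5):
    """Keep only candidates whose score exceeds the (assumed) threshold."""
    scored = [(c, similarity_score(new_question, c)) for c in candidates]
    return [(c, s) for c, s in scored if s > threshold]

# Invented example questions, for illustration only.
twins = filter_twins(
    "Is this coat waterproof?",
    ["Is this jacket waterproof?", "Is this blender dishwasher safe?"],
)
print(twins)
```

The design point is the two-stage split: the cheap retriever keeps recall high over a large database, and the expensive scorer only runs on the few hundred candidates that survive.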

3. Estimate Contextual Product Similarity

Next, the new product-question pair is compared with each twin product-question pair using a custom model called contextual product similarity (CPS).

CPS estimates whether two products are similar in the context of a specific question.

For example, two jackets may be considered similar for questions about waterproofing but not for questions about sleeve length.
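To make the jacket example concrete, here is a toy illustration of question-dependent similarity. It is not the CPS model, which is a trained model whose details the article does not describe. The attribute names, the keyword mapping, and the function `toy_cps` are all invented for this sketch.

```python
# Hypothetical product records; attribute names are invented for illustration.
jacket_a = {"waterproof": True, "sleeve_length_cm": 62}
jacket_b = {"waterproof": True, "sleeve_length_cm": 70}

# Hypothetical mapping from a question keyword to the attribute it concerns.
QUESTION_ATTRIBUTE = {"waterproof": "waterproof", "sleeve": "sleeve_length_cm"}

def toy_cps(product_1, product_2, question):
    """Return 1.0 if the products agree on the attribute the question asks about, else 0.0."""
    for keyword, attribute in QUESTION_ATTRIBUTE.items():
        if keyword in question.lower():
            return 1.0 if product_1[attribute] == product_2[attribute] else 0.0
    return 0.0  # question not covered by the toy mapping

print(toy_cps(jacket_a, jacket_b, "Is it waterproof?"))
print(toy_cps(jacket_a, jacket_b, "How long is the sleeve?"))
```

The same two jackets score as similar for the waterproofing question and dissimilar for the sleeve question, which is the behavior a question-aware similarity model is meant to capture.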

4. Predict Answer by Combining Similar Answers

Finally, the answers from the highly similar twin questions are aggregated, with each answer weighted by the CPS score.

This allows predicting the most likely answer for the new question.
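One simple way to picture the aggregation is a normalized weighted vote over yes/no answers. The article does not give the exact aggregation formula, so this form is an assumption, as are the function name `predict_answer` and the example scores.

```python
def predict_answer(twins):
    """Weighted vote over twin answers.

    twins: list of (answer, cps_score) pairs, with answer in {"yes", "no"}.
    Returns (predicted_answer, estimated_probability_of_yes).
    """
    yes_weight = sum(w for a, w in twins if a == "yes")
    total = sum(w for _, w in twins)
    if total == 0:
        return None, 0.0  # no usable evidence
    p_yes = yes_weight / total
    return ("yes" if p_yes >= 0.5 else "no"), p_yes

# Invented answers and CPS scores, for illustration only.
answer, p_yes = predict_answer([("yes", 0.9), ("yes", 0.7), ("no", 0.4)])
print(answer, round(p_yes, 2))
```

Here the two "yes" answers come from products judged more similar in context, so they outweigh the single "no" from a less similar product.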

Key Innovations

The CPS model, which estimates whether two products are similar for a specific question.
An automatic way to train CPS without requiring humans to manually label product similarities.

Results

The method was tested on yes/no shopping questions from 11 Amazon categories.

For questions with at least 10 similar questions available for comparison, it predicted the correct yes/no answer 79.5% of the time, significantly better than always guessing the most frequent class.
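For readers unfamiliar with the comparison, the "most frequent class" baseline simply answers every question with whichever label is more common. The sketch below shows how such a baseline is computed alongside ordinary accuracy. The labels and predictions are made-up toy values, not data or results from the study.

```python
from collections import Counter

def majority_baseline_accuracy(labels):
    """Accuracy obtained by always predicting the most frequent label."""
    most_common_count = Counter(labels).most_common(1)[0][1]
    return most_common_count / len(labels)

def accuracy(predictions, labels):
    """Fraction of predictions that match the true labels."""
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)

# Toy values for illustration only; not taken from the paper.
toy_labels = ["yes", "yes", "yes", "no", "no"]
toy_predictions = ["yes", "yes", "no", "no", "no"]

print(majority_baseline_accuracy(toy_labels))
print(accuracy(toy_predictions, toy_labels))
```

A model is only useful on yes/no questions if its accuracy clears this baseline, since the baseline requires no knowledge of the question at all.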

Showing the similar evidence questions helps users interpret the system’s answers.

Summary

This paper introduces a way to predict answers to questions about products that have little or no existing Q&A (the cold-start problem) by transferring knowledge from similar questions already answered about similar products.

A major benefit is providing answers even for uncommon products without many existing answers.

The CPS model and the automatic approach to generating its training data are the main innovations.

The predictions are imperfect, but showing the similar questions used as evidence lets users spot and correct inaccurate predictions themselves.

This demonstrates how cold-start issues in collaborative systems can be resolved by transferring knowledge from similar content.

Contextual similarity modeling could also have applications beyond product questions.