Suppose you’re a customer looking for a new laptop on Amazon.
The A9 doesn’t just consider the laptop’s intrinsic properties like sales, reviews, and specifications. It also takes into account the context of your search, such as your location and whether you’re a Prime member or not.
But the devil’s in the details. Because even among laptops with similar specifications, some are more popular among customers – often for no easily quantifiable reason.
These behavioral features, such as popularity and customer preferences, play a pivotal role in determining the ranking of laptops on Amazon search. In fact, they hold more weight than they would in a regular web search.
For this reason, these features account for most of the variance reduction in gradient-boosted trees, leading to more accurate and relevant Amazon search results for customers.
Now, if you’re scratching your head wondering what gradient-boosted trees are, let’s break it down.
They’re a type of machine learning model that chains together many small decision trees, each one trained to correct the errors left by the trees before it. Amazon uses them in its search ranking models to improve the accuracy of predictions. Behavioural features, such as product popularity (approximated by conversion rate), customer status (Prime/non-Prime), and sales (velocity), are used as inputs to these algorithms.
The search results might include hundreds of products with similar descriptions and features. However, some of these products are more popular than others, and have been purchased or viewed more frequently by customers. These behavioural features can be used to rank the products more accurately, by giving more weight to the products that are more popular.
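In a gradient-boosted tree, “variance reduction” measures how much splitting on a given feature tightens the model’s predictions. When behavioural features account for most of that reduction, it means they are the most informative inputs to Amazon’s ranking model.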
Translation? A better search experience for Amazon’s users – the same goal as most of these “scary-sounding but fundamentally simple” models.
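To make “variance reduction” a little more concrete, here is a minimal sketch of the calculation for a single tree split. The laptops, the feature names (`high_conversion`, `has_16gb_ram`) and the `purchased` outcomes are all invented for illustration, and a real gradient-boosted model would add up reductions like this across many splits and many trees.

```python
from statistics import pvariance

# Hypothetical toy data: six laptops, one behavioural feature, one spec feature.
LAPTOPS = [
    {"high_conversion": True,  "has_16gb_ram": True,  "purchased": 1},
    {"high_conversion": True,  "has_16gb_ram": False, "purchased": 1},
    {"high_conversion": True,  "has_16gb_ram": True,  "purchased": 1},
    {"high_conversion": False, "has_16gb_ram": True,  "purchased": 0},
    {"high_conversion": False, "has_16gb_ram": False, "purchased": 0},
    {"high_conversion": False, "has_16gb_ram": False, "purchased": 1},
]


def variance_reduction(rows, feature, target="purchased"):
    """Drop in target variance when rows are split on a yes/no feature."""
    parent = [r[target] for r in rows]
    yes = [r[target] for r in rows if r[feature]]
    no = [r[target] for r in rows if not r[feature]]
    if not yes or not no:
        return 0.0  # the feature does not actually split the data
    # Size-weighted average of the variance inside each branch.
    after = (len(yes) * pvariance(yes) + len(no) * pvariance(no)) / len(parent)
    return pvariance(parent) - after


behavioural = variance_reduction(LAPTOPS, "high_conversion")
spec = variance_reduction(LAPTOPS, "has_16gb_ram")
print(f"behavioural: {behavioural:.3f}, spec: {spec:.3f}")
```

In this toy data the behavioural feature separates buyers from non-buyers far better than the spec feature does, which is the pattern the A9 research describes at a much larger scale.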
Having just covered the topic of micro credits, you’re probably starting to grasp how Amazon gauges those qualitative factors that make the difference between a technically sound product and a product that shoppers go crazy for.
Price sensitivity, perceived value over actual value, brand recognition (Apple, for instance) – these are all behaviours you can optimise for.
And the lesson merits repeating: While you’re optimising for keywords and copy, don’t forget to consider how you can optimise for customer behaviour.
10 different customers might search for the same product in 10 different ways.
The challenge, and indeed the science, lies in understanding these search queries to recommend the most relevant products.
Let’s take a search for “casual dress”.
The system recognises “dress” as the core product type and “casual” as a modifier. This understanding allows the system to filter out non-dress items like “dress shoes” or “dress socks” from the search results.
It then assigns each product the appropriate product type label, such as “tshirt” for a tshirt, “shirt” for a shirt, and “clothing” for clothing.
These detected product types in the customer’s search query and in the product descriptions are then used to help rank the products more accurately. The system also factors in product brands and common modifiers like colour and gender to ensure accurate matching.
This intricate process is carried out using what’s called probabilistic context-free grammar (PCFG), alongside logistic regression models. Long story short, these tools take messy human language, figure out what it actually means, and use that understanding to make predictions about what the user is looking for.
PCFG is a tool used in natural language processing to model the structure of sentences. It’s like a set of rules that helps the system understand how words in a sentence relate to each other. For example, in the sentence “I love red apples”, PCFG helps the system understand that “I” is the subject, “love” is the action, and “red apples” is the object. The “probabilistic” part means that it uses probabilities to make guesses about these relationships, based on patterns it has learned from lots of other sentences.
In practical terms, a PCFG can help in product search by determining the type of product being searched for, based on the words used in the search query. This allows the system to exclude non-relevant products and present the most relevant results to the customer.
This tool is trained using a technique called Variational Bayes and is also influenced by information from the Amazon catalog, like common product labels and known brands.
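As a rough illustration of the idea, and not Amazon’s actual grammar, here is a heavily simplified sketch in the spirit of a PCFG. It assumes a single grammar shape (zero or more modifiers followed by a product-type head) and scores every possible split of the query. The probabilities in `P_WORD` are invented, and only a small slice of each distribution is shown.

```python
from math import prod

# Hypothetical rule probabilities P(word | role); a real grammar would learn these.
P_WORD = {
    "modifier": {"casual": 0.30, "dress": 0.10, "growth": 0.20, "beard": 0.05},
    "product": {"dress": 0.25, "shoes": 0.30, "beard": 0.15, "oil": 0.20},
}
UNSEEN = 1e-4  # small probability for a word never seen in that role


def best_parse(query):
    """Return (probability, modifiers, product_type) for the likeliest split."""
    words = query.lower().split()
    if not words:
        return (0.0, [], "")
    parses = []
    for k in range(len(words)):  # first k words are modifiers, the rest form the head
        modifiers, head = words[:k], words[k:]
        p = prod(P_WORD["modifier"].get(w, UNSEEN) for w in modifiers)
        p *= prod(P_WORD["product"].get(w, UNSEEN) for w in head)
        parses.append((p, modifiers, " ".join(head)))
    return max(parses, key=lambda parse: parse[0])


for q in ("casual dress", "dress shoes"):
    _, modifiers, product_type = best_parse(q)
    print(q, "->", product_type, modifiers)
```

With these made-up numbers, “casual dress” resolves to the product type “dress” with the modifier “casual”, while “dress shoes” resolves to the product type “dress shoes”. That is why a dress search can leave the shoes out.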
Logistic regression models, meanwhile, are a type of mathematical model used in statistics and machine learning to make predictions or decisions. In the context of Amazon, logistic regression models are used to predict which products a customer is most likely to be interested in, based on the words they used in their search query.
For example, if someone searches Amazon for “waterproof hiking boots,” the logistic regression model would analyze this search query and predict that the customer is likely interested in hiking boots that have a waterproof feature.
Let’s consider another example.
If someone searches for “beard oil”, it’s not immediately clear whether they are open to purchasing any beard oil, or are specifically interested in organic beard oil.
To solve this puzzle, the system uses PCFG to recognize the core product type (in this case “beard oil”) and then analyze any additional information (like “organic”) that might be included in the search.
Once the product type has been recognized, the system then uses this information to label each product appropriately. For example, a product might be labeled as both “beard oil” and “organic beard oil”. This process is similar to categorizing products and is done using the logistic regression models.
Finally, this information about product types is used to create features that help improve the results of searches and make sure the most relevant products are ranked higher in the search results.
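To show roughly what a logistic regression labeler looks like, here is a minimal sketch that learns to tag product titles as “beard oil” or not. The six training titles are invented, the features are a plain bag of words, and the training loop is basic gradient descent. It is a toy under those assumptions, not a reconstruction of Amazon’s classifier.

```python
import math

# Hypothetical labelled titles: 1 = "beard oil", 0 = something else.
TRAINING = [
    ("beard oil", 1), ("organic beard oil", 1), ("beard growth oil", 1),
    ("beard comb", 0), ("beard trimmer", 0), ("hair brush", 0),
]
VOCAB = sorted({word for title, _ in TRAINING for word in title.split()})


def features(title):
    """Bag-of-words vector plus a constant 1.0 for the bias term."""
    words = set(title.lower().split())
    return [1.0 if v in words else 0.0 for v in VOCAB] + [1.0]


def predict(weights, title):
    """Estimated probability that the title describes a beard oil."""
    z = sum(w * x for w, x in zip(weights, features(title)))
    return 1.0 / (1.0 + math.exp(-z))


def train(rows, steps=1000, lr=0.1):
    """Fit the weights with batch gradient descent on the log loss."""
    weights = [0.0] * (len(VOCAB) + 1)
    for _ in range(steps):
        grad = [0.0] * len(weights)
        for title, label in rows:
            error = predict(weights, title) - label
            for i, x in enumerate(features(title)):
                grad[i] += error * x
        weights = [w - lr * g for w, g in zip(weights, grad)]
    return weights


WEIGHTS = train(TRAINING)
for title in ("organic beard growth oil", "wooden beard comb"):
    print(title, "->", predict(WEIGHTS, title) > 0.5)
```

Words the toy model has never seen, such as “wooden”, are simply ignored, so the second title is judged on “beard” and “comb” alone.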
In short, relevance is the linchpin in ranking because it helps determine the most suitable and useful products for a specific query or search. In product search, it’s crucial to differentiate between the product type and a modifier as they play different roles in determining the final search results.
Understanding the difference between product type and modifier helps improve the accuracy of search results, and provides a better user experience by delivering more relevant products.
A great many products fail because brands don’t score their keywords in terms of relevance.
So next up, I’ll be showing you a framework that can significantly enhance your chances of a successful product launch.
By implementing this strategy, you’ll be able to cut down on your PPC spend, make informed decisions before launching a product, and ensure that your genuinely relevant keywords have sufficient volume to justify a launch.
Even among 7-8 figure brands, a common pitfall I’ve observed is leaning on keywords for associated products to inflate aggregate search volume. This often leads their launch campaigns to fail due to a lack of relevance.
Now let’s get clear on how to avoid this!
Before we proceed, please be aware that the following is not part of the scientific literature.
Instead, this framework has been developed by us, using what we know from the literature as a base.
This is not the only possible system. Some parts of it are necessarily arbitrary, and there’s nothing that would stop you from coming up with your own. But we think you’ll get pretty good results from this one.
The following quote from the A9 team will help to set the scene, as it reveals a lot about the language processing that takes place in the back end:
“We treat each query as a noun phrase and consider the head of the noun phrase to be the core product type and all other words in the query to be modifiers.” – The A9 team
It sounds important. And indeed it is. But what does it actually mean?
Let’s break it down with the example of a search for “growth beard oil”.
Firstly, there will be no subcategory entitled “growth beard oil” on Amazon. Everything that’s going on here is happening in the background – it won’t show up in the category ladder that you can physically see in Amazon’s front end.
At this stage, the algorithm is working to identify the core product type, and then the modifiers.
The words that make up the query are checked against the list of product types – which, like customer behaviour itself, is constantly changing as new products are created and the commercial landscape shifts.
In this instance, the words “beard” and “oil” would both be tagged by the A9 as being the product type. Meanwhile, the word “growth” would be tagged as a modifier.
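The lookup step can be sketched very simply. This assumes the head of the query sits at the end (as it does in “growth beard oil”) and checks the longest trailing phrase against a list of known product types. The `PRODUCT_TYPES` set is a hypothetical stand-in for Amazon’s real, constantly changing list.

```python
# Hypothetical product-type list; the real one is far larger and changes over time.
PRODUCT_TYPES = {"beard oil", "oil", "dress", "dress shoes"}


def split_query(query):
    """Tag the longest trailing phrase that is a known product type as the head."""
    words = query.lower().split()
    for start in range(len(words)):  # longest candidate first
        head = " ".join(words[start:])
        if head in PRODUCT_TYPES:
            return {"product_type": head, "modifiers": words[:start]}
    return {"product_type": None, "modifiers": words}


print(split_query("growth beard oil"))
```

Trying the longest phrase first is what lets “beard oil” win over plain “oil”. A query whose head is not at the end, such as “beard oil for growth”, would fall through this sketch untagged, which is one reason the real system uses a probabilistic grammar.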
To give you a single-sentence summary of what I’m about to tell you: You want your keywords to be super, super relevant.
To run through how to implement this system, we’re going to use the example search term “plugin air freshener”.
Every keyword relating to a product can be sorted into 1 of 5 categories quantifying its relevance.
Relevance Score: 1 – The irrelevant keywords
Here, we’re talking about the lowest relevance score that you could attribute to the search query, based on the product.
Some examples might be the keywords “carpet freshener” and “car air freshener”.
Logically, a more primitive algorithm, working purely on semantics, might look at these keywords and think ‘well, they both say “freshener”, and one even says “air”, so they must be relevant.’
But common sense clearly tells us that these results are not going to be relevant to someone who is looking for a plugin air freshener. They have nothing in common with the customer’s particular needs.
In this way, we can establish the “lower bound” of relevance for our keyword sets.
Relevance Score: 2 – The indirectly or broadly related keywords
A lot of problems can arise when people go after the keywords in this category.
Continuing with our “plugin air freshener” example, let’s consider the keywords “air freshener” and “air freshener spray”.
Both of these—somewhat counter-intuitively—would be only indirectly or broadly relevant, which puts them well short of the level of relevance we’re looking for in order to execute a profitable launch.
Remember, this is specifically a plugin air freshener; not something that you spray. Using this information, it may be inferred that the majority of unspecified “air fresheners” are also likely to be unsuitable – or the user wouldn’t have added the modifier.
This may seem overly precise, and it’s true that we can be talking about a hair’s breadth of difference. But when it comes to the relationship between the product type and the modifier, a hair’s breadth can become a chasm.
Relevance Score: 3 – The general niche
Next, we turn our attention to the general niche. In this case, the term “plugin air freshener”.
Ah, now this is the good stuff, right? Surely this is as relevant as it gets?
You’d be forgiven for thinking so, but not so fast.
Remember, the plugin air freshener that the shopper ultimately buys will be a single product WITHIN the general niche. They will have chosen it for its specific features, and the way it ticks their particular boxes. That being the case, we can do better.
Relevance Score: 4 – More specific than the general niche
What you’re doing here is niching down into specific differentiators. You take the general niche, and identify relevant terms to combine it with. We’ll call these additional details “specifiers.”
In this case, suppose our plugin air freshener comes with refills, and/or the option to buy additional refills.
We can now look at “plugin air freshener with refills” as a relevant keyword, derived from a distinct identifying feature of the product.
Equally, combining the general niche with two weaker specifiers can yield highly relevant keywords.
Relevance Score: 5 – Very specific to the product
We’re now talking about keywords which are very specific to the product.
Think about a branded keyword like “airwick”. Now, you might have noticed on your travels that a lot of truly stellar conversion rates tend to happen on branded keywords.
The devil, once again, is in the details. As is typical of keywords with formidable-looking conversion rates, there often isn’t enough search volume around these branded terms (unless you’re talking about “apple”, “adidas” and the like), so due diligence is needed, as ever.
We end up, then, with a keyword like “airwick plugin air freshener with refills” after appending our specifiers/modifiers.
It’s helpful to think about the “anatomy” of these 5-point keywords.
General niche: “plugin air freshener”
More specific niche: “plugin air freshener with refills”
Brand name (unique in most cases): “airwick”
Crucially, in the scoring system that we’re going to walk through now, we give the brand name specifier a +2. In a nutshell, it adds more relevance than the other specifiers.
Remember our earlier quote from the A9 team?
“We treat each query as a noun phrase and consider the head of the noun phrase to be the core product type and all other words in the query to be modifiers.”
We can see this all come together if we look at the complete title of our Airwick plugin air freshener:
Air Wick plug in Scented Oil Starter Kit, 2 Warmers + 6 Refills, Lavender & Chamomile, Eco friendly, Essential Oils, Air Freshener
More modifiers than you can shake a stick at. Now, to come back to how these modifiers affect relevance score…
Scoring the relevance of core product types
Here’s where we put our tiered relevance-scoring system into practice.
A core product type of “carpet freshener” gets a relevance score of 1, because as we’ve seen, it falls under the “Irrelevant” category.
The core product type “air freshener” gets a relevance score of 2 as it falls under the “indirectly or broadly related” category, while “plugin air freshener” comes in at a 3 because it fits within the “general niche” category.
Assessing the overall relevance score of each keyword is as simple as adding up its total relevance points.
Scoring the relevance of modifiers
Now the modifiers enter the picture. Each of these adds a certain amount of relevance to our keyword.
Because it’s so unique, we give the brand name modifier a 2. So “airwick” adds 2 points of relevance to our keyword where the other modifiers add only 1 point.
These 1-point modifiers would include “refill”, “electric”, “home” and “lavender”.
Bringing it together
We now have everything we need to score the relevance of our keywords. Let’s see what this looks like in practice. Take “airwick plugin air freshener with refills”: the core product type “plugin air freshener” contributes 3, the brand name “airwick” adds 2, and “refills” adds 1, for a total of 6.
Remember, a single keyword might include multiple modifiers, such as “air freshener with refill for home”. Search volume is likely to taper off as keywords get longer, and we generally consider 5 to be a perfect relevance score, but there’s no theoretical maximum.
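If you’d rather script the scoring than do it by hand, here is a minimal sketch of the system described above. The point values come straight from this article. The function name `relevance_score`, the substring matching and the crude plural handling are our own simplifications, so treat it as a starting point.

```python
# Core product types and their points, as scored in this article.
CORE_TYPES = {
    "carpet freshener": 1,
    "car air freshener": 1,
    "air freshener": 2,
    "plugin air freshener": 3,
}
# Modifiers: the brand name is worth 2, every other modifier is worth 1.
MODIFIERS = {"airwick": 2, "refill": 1, "electric": 1, "home": 1, "lavender": 1}


def relevance_score(keyword):
    """Core product type points plus the points of every recognised modifier."""
    text = keyword.lower()
    # The longest matching core type wins, so "plugin air freshener"
    # takes priority over the "air freshener" it contains.
    core = max((c for c in CORE_TYPES if c in text), key=len, default=None)
    score = CORE_TYPES[core] if core else 0
    leftover = text.replace(core, " ") if core else text
    for word in leftover.split():
        singular = word[:-1] if word.endswith("s") else word  # "refills" -> "refill"
        score += MODIFIERS.get(word, MODIFIERS.get(singular, 0))
    return score


for kw in (
    "carpet freshener",
    "air freshener spray",
    "plugin air freshener",
    "plugin air freshener with refills",
    "airwick plugin air freshener with refills",
):
    print(relevance_score(kw), kw)
```

Words that appear in neither table, such as “with” or “spray”, add nothing, which matches the way we score by hand.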
To circle back to the most foundational point: The A9 works in terms of contextual and behavioural matching, optimising for maximum relevance.
For this reason, scoring the relevance of your search terms helps you to predict—though not to guarantee—their behaviour in PPC and organic ranking.
And higher relevance generally equates to a higher conversion rate. This all feeds into the flywheel, because advertising on higher-conversion keywords is more likely to improve your organic ranking than advertising on lower-conversion keywords (“higher” and “lower” being relative to the top-ranking ASINs in that category).
Hopefully the core message is now intuitively clear: There is far more to the puzzle than the mere popularity of a keyword.
And in a world where even 7 and 8-figure sellers often don’t bother targeting above a “3” in relevance, this information edge is often enough to put you well ahead of the competition.
Earlier, we touched on one caveat that’s worth re-emphasising.
In general, hyper-relevant keywords are likely to have a fraction of the search volume of their less-relevant counterparts. And to some extent that’s just a rule of the game. But nonetheless, intelligently targeting your keywords based on relevance remains a powerful way to ensure that your product launches are profitable.
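One way to sanity-check that trade-off before a launch is to add up the search volume that sits at or above your chosen relevance threshold. The sketch below does exactly that. The scores follow the 5-point system, but the monthly volumes are made-up placeholders, and the `min_score` cut-off of 4 is an assumption you would tune for your own product.

```python
# Hypothetical keyword list: scores follow the 5-point system, volumes are invented.
KEYWORDS = [
    {"keyword": "air freshener", "score": 2, "volume": 90000},
    {"keyword": "plugin air freshener", "score": 3, "volume": 20000},
    {"keyword": "plugin air freshener with refills", "score": 4, "volume": 1500},
    {"keyword": "airwick plugin air freshener", "score": 5, "volume": 800},
]


def relevant_volume(keywords, min_score=4):
    """Total search volume across keywords at or above the relevance threshold."""
    return sum(k["volume"] for k in keywords if k["score"] >= min_score)


print("All keywords:", relevant_volume(KEYWORDS, min_score=1))
print("Score 4 and up:", relevant_volume(KEYWORDS))
```

If the second number is too small to justify a launch, that is worth knowing before any PPC budget is spent.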
So as we wrap this up, even if you take nothing else away from this entire article, following this 5-point system before your launches will save you a fortune in Amazon PPC spend.
And with that, you’re now in possession of a tool that we never even intended to make public. We developed this system in-house because it was the most effective way to get results. I hope you have the same success with it that we’ve had.
Ultimately, the science of the A9 is all about moving from feelings to facts to foundations.
We’ll never know it all. But we can be sure of a few core ideas, and once you understand those principles, you can use them to derive further conclusions so that you’re not shooting in the dark.
That way your observations can be better informed, and you’ll have a platform for carrying out tests which are much more likely to benefit your business.
Most of the points we’ve covered today are drawn from published research. And the core principles are likely to look the same 10 years from now. But there’s no getting around the fact that certain aspects of Amazon’s workings are unknown by design, and likely to stay that way.
Our goal, therefore, is a realistic one: To formulate not a perfect strategy, but a better strategy – based on the way Amazon actually works, not the way we wish it would work.
And so, having dispensed with the impossible goal of “perfectly” understanding every last little nuance of the machine, it’s my sincere hope that this article leaves you with a much greater understanding of it than you had before.
Take care
Danny & Ben