AMAZON A9 ALGORITHM

But what is the Amazon A10 Algorithm?

Well, we’ve already covered the short version: There’s no such thing.

But since you’re reading this section, you’d probably like to hear the longer version.

So here it is.

A long-standing myth

The A10 is up there with the Loch Ness Monster. A lot of talk about it; not so much concrete evidence… but a topic of enduring fascination nonetheless. 

In part, we can chalk the rumours up to a general lack of knowledge. Equally, there’s always going to be a certain amount of clickbait circulating around the Amazon ecosystem.

By understanding why the myth has been kicking around for so long—and indeed, how we can be so sure that it is a myth—we can gain a stronger grasp of some of the principles that make the A9 tick. 

Algorithms are not static entities

Firstly, it’s essential to understand that algorithms don’t stay the same. They change and evolve from year to year; month to month; in some instances even day to day, adapting to changes in data and user needs. This is particularly true for search algorithms like the A9 on Amazon, which must constantly adjust to shifts in user behaviour, preferences, and the ever-changing landscape of products sold on Amazon.

The A9 is not a “one and done” algorithm that stands unchanging in the face of time. Amazon has made innumerable updates to the A9 algorithm, introducing significant changes like altering the way product reviews are displayed and introducing advertising options. These changes are far from just cosmetic; they are integral to the algo’s evolution.

Digging deeper, we find a history of incremental changes to the A9. Take the Pay-Per-Click (PPC) example – gone are the days when you could skyrocket to high rankings with a poor conversion rate by merely throwing money at PPC. Now, a poor conversion rate can negatively impact both your organic rank and your PPC performance. 

These changes have all required updates to the A9, so they serve as evidence of its ongoing refinement.

If it ain’t broke… 

Given this history of iterative improvement, it’s far more likely that Amazon will continue to refine the A9 rather than scrap it for a shiny new A10. It’s simply more efficient to build on existing knowledge and expertise than to reinvent the wheel.

Moreover, replacing the A9 with an A10 would pose significant infrastructure-level risks. The A9 is a complex system, tightly woven into the fabric of Amazon. Replacing it would require extensive testing and validation to ensure the new algorithm is performing correctly – a process that would be both time-consuming and resource-intensive.

Furthermore, a new algorithm would take time to ramp up and produce better results than the existing one. After all, why go through the trouble of creating a brand new algorithm if it doesn’t outperform the old one? 

This ramp-up period could—and most likely would—impact the user experience and, by extension, Amazon’s revenue, leaving shareholders unhappy and potentially impacting the share price. And it seems safe to assume that Andy Jassy doesn’t want that on his resume.

As a final reflection, dismantling its most important algorithm is simply not a move that aligns with Amazon’s overall approach to its business. From day 1 (which, as we all know, is every day) the company has been characterized by long-term planning and investment. Capricious swings from one approach to another are not in its DNA.

So while we can’t definitively rule out a new algorithm in the future, it seems much more likely that Amazon will continue the process of incremental improvement that has made it one of the most valuable companies in the world.

So why is it called the A9?

But wait, I hear you say. Wasn’t there an A1? And an A2, and an A3, and… ?

You’d be forgiven for asking. But no, the A9 is the only search algorithm Amazon has ever used.

And as for the name… how many letters are in “algorithm”?

Yep. That’s really why they called it that.


DEEP DIVE:

How Are Ranking Models Trained to Index Products?

Imagine that a company sells kitchen appliances, like blenders, toasters and mixers, on Amazon. 

Bringing together a few of the threads we’ve discussed, here’s how the ranking models are trained to serve the most relevant results for the category:

  • The Amazon A9 team collects data from customer traffic for several days to construct training, validation, and test sets.

  • The training and validation sets are used to train and tune the ranking model for the kitchen appliances category on Amazon. Meanwhile, the test set is used to assess the effectiveness of the model: in scientific terms, it acts like the control in an experiment, since it is held back and never used for training.

  • In the data collection on Amazon, impressions that resulted in either a click or purchase for the products are considered positive examples.

  • Post-search activity is also crucial. When a user searches and cannot find the product they want under that search query, the Amazon A9 captures that behavior and ties it to the query within the data set. This behavioral signal can then be used to improve results for future searches on that search query within that category. (The whole pipeline is sketched in code just below.)
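To make those bullet points a little more tangible, here's a rough sketch in Python of how impression logs might be labelled and split. Everything in it (the log format, the split ratios, the products) is an illustrative assumption, not Amazon's actual pipeline.

```python
# Illustrative only: a toy version of collecting impressions, labelling them,
# and splitting into training, validation, and test sets.
import random

# Hypothetical impression log gathered over several days of customer traffic:
# (search query, product ASIN, clicked?, purchased?)
impressions = [
    ("blender", "B001", True,  False),
    ("blender", "B002", False, False),
    ("toaster", "B003", True,  True),
    ("mixer",   "B004", False, False),
    ("mixer",   "B005", True,  False),
]

# An impression that led to a click or a purchase counts as a positive example.
labelled = [
    {"query": q, "asin": a, "label": int(clicked or purchased)}
    for q, a, clicked, purchased in impressions
]

# Training and validation sets fit and tune the model; the test set is held back
# so the model can be judged on data it has never seen.
random.seed(42)
random.shuffle(labelled)
n = len(labelled)
train      = labelled[: int(0.7 * n)]
validation = labelled[int(0.7 * n): int(0.85 * n)]
test       = labelled[int(0.85 * n):]
```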
 
 

On the face of it, you might imagine that this system would lead to self-reinforcing biases. After all, the products at the top of the Product SERPs get by far the most clicks – so doesn’t this lead to an impossible battle for those further down?

Thankfully, it doesn’t quite work that way. The ranking model is designed to adapt its bias correction from day to day based on customer behavior, so if lower-ranking products start growing in popularity, Amazon will take that relative gain into account and give those products a well-earned ranking boost.
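One standard way this kind of correction is handled in the learning-to-rank literature is inverse propensity weighting: a click on a product that shoppers rarely scroll down to counts for more than a click at the top of the page. The sketch below is a generic illustration of that idea, not a confirmed Amazon implementation, and the examination probabilities are invented.

```python
# Illustrative position-bias correction via inverse propensity weighting.
# The probability of a shopper even looking at each slot is an assumption.
examination_prob = {1: 0.70, 2: 0.45, 3: 0.30, 4: 0.20, 5: 0.12}

def debiased_click_weight(position: int, clicked: bool) -> float:
    """Weight a click by the inverse of how likely that slot is to be examined."""
    if not clicked:
        return 0.0
    return 1.0 / examination_prob.get(position, 0.05)

# A click at position 5 earns far more "credit" than one at position 1,
# which is how a genuinely popular low-ranked product can climb.
print(round(debiased_click_weight(1, True), 2))  # 1.43
print(round(debiased_click_weight(5, True), 2))  # 8.33
```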

Unsurprisingly, the disproportionate share of clicks has turned top-of-search positions into highly coveted real estate. It’s equally unsurprising that this territory is dominated by paid ads, which have become a major revenue stream for Amazon over the last few years.

Feature selection on Amazon

As we know, customer engagement data isn’t the only plentiful resource the A9 has to work with. You’ll remember from the section on feature vectors that it also has a boatload of data about the features of each product.

And this presents a different problem altogether. Because when you have a plethora of different features, how do you figure out which of those to use in determining a product’s relevance?

The technical challenge becomes clear when you look at the number of features that can influence the search results. For a given search, this might include:

  • Product name
  • Brand
  • Product description
  • Product images
  • Reviews
  • Star ratings
  • Price
  • Conversion rate
  • Availability
  • Category
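To picture what the algorithm has to work with, here's a toy feature vector for a single product, mirroring the list above. All the field names and values are invented for illustration; this is not Amazon's internal schema.

```python
# A hypothetical feature vector for one product under one search query.
product_features = {
    "product_name": "ProBlend 5000 Countertop Blender",   # made-up product
    "brand": "KitchenCo",
    "description_length": 1240,        # characters of listing copy
    "image_count": 7,
    "review_count": 3812,
    "star_rating": 4.6,
    "price": 89.99,
    "conversion_rate": 0.12,
    "in_stock": True,
    "category": "Kitchen & Dining > Blenders",
}
```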
 
 

So what does the A9 Algorithm do?

This problem is solved by a model for feature selection – the process of narrowing the full list down to the features that the Amazon A9 is actually going to use.

First up, Amazon will run through the list of features at its disposal, and might find that a handful of them have a significant impact on the search results. So, they might keep those features and remove the rest.

Next, Amazon might use backward elimination or forward selection to narrow the list down to the most important features for that search. 

For example, they might find that the 4 most important features are:

  • Reviews
  • Star ratings
  • Price
  • Conversion rate
 

These 4 features will now be used to present search results to users.

For example, a user might see:

  • Products with high star ratings and positive reviews on Amazon.
  • Products within a certain price range
  • Products with high conversion rates, indicating that they are popular among buyers
 

This process allows Amazon to highlight the features that are the most important to their users. As with most of these things, the net result is a better search experience.

The algorithm employs both backward elimination and forward selection in narrowing down these lists. These lofty-sounding approaches are really 2 sides of the same simple coin. Let’s use the same 4 features above to illustrate the point.

In backward elimination, you start with all 4 features. So we have reviews, star ratings, price and conversion rate. You then remove the least important feature, based on a criterion which is decided by the A9 team. There are various methodologies for deciding which is least important—e.g. information gain or a chi-squared test—but it boils down to determining which feature has the least valuable impact on the search results. Depending on how far you want to narrow it down, you then repeat this process until only the most important features remain. Pretty simple when you get right down to it, isn’t it?

In forward selection, you simply do the inverse. That is, using the same criterion, you start with no features, and add in the next most important one until you’ve hit your target number of desired features.
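To make the two procedures concrete, here's a minimal sketch using scikit-learn's off-the-shelf selectors on synthetic click data. The feature names, the fake data, the logistic-regression stand-in and the chi-squared first pass are all assumptions for illustration; Amazon's real criteria and models aren't public.

```python
# Illustrative feature selection: a univariate first pass, then backward
# elimination and forward selection. Synthetic data throughout.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.feature_selection import SelectKBest, chi2, SequentialFeatureSelector

feature_names = ["reviews", "star_rating", "price", "conversion_rate",
                 "availability", "image_count", "description_length"]
rng = np.random.default_rng(0)
X = rng.random((500, len(feature_names)))                            # fake, non-negative feature values
y = (X[:, 1] + X[:, 3] + 0.2 * rng.random(500) > 1.1).astype(int)    # fake "clicked" labels

# First pass: keep only the handful of features with a significant univariate
# relationship to the outcome (a chi-squared test here).
first_pass = SelectKBest(chi2, k=5).fit(X, y)

model = LogisticRegression()

# Backward elimination: start with every feature, repeatedly drop the least useful.
backward = SequentialFeatureSelector(model, n_features_to_select=4,
                                     direction="backward").fit(X, y)

# Forward selection: start with none, repeatedly add the most useful.
forward = SequentialFeatureSelector(model, n_features_to_select=4,
                                    direction="forward").fit(X, y)

print("Backward kept:", [f for f, keep in zip(feature_names, backward.get_support()) if keep])
print("Forward kept: ", [f for f, keep in zip(feature_names, forward.get_support()) if keep])
```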

Ranking model training is not carried out globally

It’s also worth noting that the training of the model is carried out category by category, and for one platform at a time.

In other words, Amazon doesn’t change the entire landscape at once – any changes they make to the search results are confined to a particular category and a particular geography.

This ought to give you some reassurance whenever you hear that the search results are “up and down like the stock exchange” and the like. They may be, but it doesn’t mean that Amazon as a whole has descended into chaos – they’re just making some tweaks to that specific category and platform. 

Of course, it may well feel like chaos if they’re testing a product category on Amazon occupied by many sellers in the US market, for example. 

The more mundane reality is that each country has a team on their local platform who carry out the tests. If you think about it, any other setup would require a team fluent in multiple languages; who could translate from, say, Japanese to Polish to German to Spanish. And I wouldn’t know where to find a single person who can do that – let alone an entire department of them who also happen to be experts in ecommerce-specific data science.

Positive and negative labels

You’ll remember that the Amazon A9 Algorithm is trained on data derived from customer actions. But far from stopping at clicks, add-to-carts and purchases, Amazon uses what can be likened to a “carrot and stick” approach. 

Actions that customers take—clicking on a product, adding items to their cart, making a purchase, etc.—are considered positive examples of customer behavior. These actions serve as valuable data points that inform the models about what works and what doesn’t.

The richness of this data is amplified in digital categories such as Kindle books, Amazon Video, and Amazon Music. In these categories, customer behavior can be tracked with greater precision, providing a greater depth of information that can be used to refine and improve the ranking models.

Central to the ethos of positive labels is that conversion is a continuous process, not an isolated event. It’s intuitively easy to reason that shoppers either buy or they don’t. But in reality, the journey from search to purchase takes place in systematic gradations, “building up” to a purchase through incrementally increasing engagement – clicking on products, reading through the descriptions, zooming in and out of images, and so on.

On the flip side, we have negative labels, which are derived from two types of data points. The first type is what we call ‘negative scenes’. These are products that customers have implicitly rejected. For instance, if a customer clicks on the third product in a list of ten results, it can be inferred that the first two products were not appealing to them, and these can be considered as ‘negative votes’.

The second type of negative data is a bit of a misnomer. Referred to as ‘unseen’, it is actually a random sample drawn from all available products. While these products may not have been shown to customers, they are still important to include in the training data for new models. This ensures that the models are aware of products that may not perform well, even if they are never shown to customers.
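As a rough sketch of how those two kinds of negative labels could be derived from a single search impression (the ASINs, positions, and sample size are made up purely for illustration):

```python
# Illustrative derivation of negative labels from one search impression.
import random

catalogue = [f"B{i:03d}" for i in range(1, 101)]        # every product in the category
shown     = ["B010", "B022", "B003", "B047", "B031"]    # the ranked results the shopper saw
clicked_position = 3                                    # the shopper clicked the third result

# 'Negative scenes': products displayed above the clicked one were implicitly rejected.
negative_scenes = shown[: clicked_position - 1]         # ['B010', 'B022']

# 'Unseen' negatives: a random sample of products that were never displayed at all.
random.seed(7)
unseen = random.sample([p for p in catalogue if p not in shown], k=3)

positives = [shown[clicked_position - 1]]               # the clicked product
negatives = negative_scenes + unseen
```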

Here’s a quick rundown of some examples of positive and negative labels, and their respective benefits in helping the algorithm to serve relevant results:

Positive Labels:

  • Clicked
  • Added to cart
  • Purchased
  • Consumed
 

Benefits:

  • Conversion is a continuous process, and positive labels take account of the complete customer journey.
  • Time spent on the page is taken into account, along with engagement with images, clicks, and scrolling.
  • The aggregate of the data gives Amazon a clear picture of what actually works.
 

Negative Labels:

  • Ignored results
  • Product shown with no action taken
  • Negative vote attribution for unselected products
 

Benefits:

  • Helps Amazon understand customer behavior and preferences by giving a clear picture of what customers don’t want. 
  • Can be used to improve results and product recommendations, aiding in overall conversions and sales.
 

In essence, the interplay of positive and negative labels forms the bedrock of Amazon’s ranking models, providing a nuanced understanding of customer behavior and preferences. This process drives continuous improvement in product recommendations and overall sales.

Micro credits

Here’s a highly intuitive way of thinking about positive and negative labels. 

First, a quick disclaimer – this is not part of the official literature, and I want to credit (no pun intended) Garfield Coore for the term “micro credits”, as that’s where I first heard it.

If we think about a purchase as being a “credit”—a clear, positive signal to the A9 that customers are interested in that product—then the positive labels that lead to the sale are like micro credits.

All of the positive labels we’ve discussed fit within this bracket. Micro credits are all those little things that contribute to the final conversion of a sale, encompassing everything from clicking, to staying on the page for longer, to zooming in on the images.

They help Amazon to more fully understand how customers are engaging with products for a given search term.

The following are examples of positive micro credits:

  • Time on page: The longer a customer stays on the page, the more likely they are to engage with the product.

  • Engagement with images: The more a customer interacts with the images of a product, the more likely they are to be interested in it.

  • Clicks: The more clicks a product receives, the more likely it is to be seen by other customers.

  • Scrolling: Scrolling through the product images and information is a sign of engagement and interest.

  • Add to cart: Adding a product to the cart is a strong indication of interest and a step closer to making a purchase.
 

By the same token, you can do the math on how negative credits like a low engagement rate will detract from a product’s visibility and ranking over time.

To a degree, then, the A9 algorithm works by pitting all of the products in a search against each other based on these metrics, to determine the order in which to rank those products.

In essence:

Maximizing Positive Labels = Better Engagement = Micro Credits = Better Ranking (in principle, at least)
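To put toy numbers behind that chain, here's a hypothetical scoring sketch. The signal names and weights are invented purely for illustration; they are not Amazon's real feature list or weighting.

```python
# Illustrative "micro credits": weight each engagement signal and rank by the total.
WEIGHTS = {
    "click": 1.0,
    "time_on_page_sec": 0.02,
    "image_zooms": 0.5,
    "add_to_cart": 3.0,
    "purchase": 10.0,   # the full "credit"
}

def micro_credit_score(signals: dict) -> float:
    """Sum each engagement signal weighted by its assumed contribution."""
    return sum(WEIGHTS.get(name, 0.0) * value for name, value in signals.items())

products = {
    "B001": {"click": 1, "time_on_page_sec": 45, "image_zooms": 2, "add_to_cart": 1},
    "B002": {"click": 1, "time_on_page_sec": 8},
    "B003": {"click": 1, "time_on_page_sec": 120, "image_zooms": 4, "add_to_cart": 1, "purchase": 1},
}

# Rank products by accumulated micro credits, highest first.
ranking = sorted(products, key=lambda asin: micro_credit_score(products[asin]), reverse=True)
print(ranking)  # ['B003', 'B001', 'B002']
```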

And the real upshot is this: By understanding that all of these little metrics do count towards your final ranking, you can make informed decisions to improve your Amazon product ranking, increase customer engagement, and drive more sales.