Yandex’s AI algorithm

Written by Anastasia Kurmakaeva

Today we are going to talk about how Yandex’s search algorithm has evolved over the past few years, and about the key updates that marked a turning point in the way the search engine analyses queries and returns results based on users’ needs. Palekh (2016), Korolyov (2017) and Andromeda (2018) rely on neural networks to better understand search intent, going a step beyond analysing simple keywords to understanding their meaning.

Despite Google’s dominance in most countries around the world, Yandex still holds a larger share of the Russian search market than the Californian giant. Given the former’s unstoppable expansion and technological development, it doesn’t look like anything is going to change in the years to come.

🎯 According to SEJournal, in 2019 52% of Russian-speaking users still prefer using Yandex, as opposed to 46% of Internet users who choose Google.

In the same interview with the Yandex team published on SEJournal, we also learned that mobile search and voice search are becoming increasingly significant among Russian users, accounting for 56% and 20% of the total respectively.

Palekh

Since introducing Palekh in November 2016, Yandex has kept refining its neural-network-based search algorithm so that it can use machine learning to answer more complex search intents and queries, paying special attention to long-tail ones. The first release was limited: it could only analyse the titles of web pages, not their content as a whole. It was also considerably slower than its successor (more on that below), processing around 40% of the 280 million daily requests made to the search engine.

The “semantic vectors” technology used by Palekh is based on distributional semantics. As Yandex explains on its blog (in Russian), the words of billions of queries are converted into numbers or, rather, into groups of 300 numbers each. Each group is a point in a 300-dimensional space, where every document also has its own vector. If the numbers corresponding to a query are near those corresponding to a document in that space, the result is considered relevant. The closer they are to each other, the more relevant the result the search engine returns to the user.
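To make the idea of “nearness” more concrete, here is a minimal sketch in Python. It is not Yandex’s implementation: the vectors are made up and have 3 dimensions instead of 300, the page titles are invented, and cosine similarity is used as the measure of closeness purely as an assumption for illustration.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def rank_by_similarity(query_vector, title_vectors):
    """Sort page titles from the closest vector to the farthest one."""
    scored = [
        (title, cosine_similarity(query_vector, vector))
        for title, vector in title_vectors.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Hypothetical, hand-written vectors: a real system would get them from a
# trained neural network and they would have 300 dimensions.
title_vectors = {
    "The Martian": [0.9, 0.8, 0.1],
    "Potato farming guide": [0.1, 0.9, 0.0],
    "Fairy games for tablets": [0.0, 0.1, 0.9],
}

# Made-up vector for a query such as
# "film about a man who grows potatoes on another planet".
query_vector = [0.8, 0.7, 0.2]

ranking = rank_by_similarity(query_vector, title_vectors)
for title, score in ranking:
    print(f"{title}: {score:.2f}")
```

With these toy numbers the query lands closest to “The Martian”, even though the query and the title share no words, which is exactly the kind of match that keyword comparison alone would miss.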

The algorithm is named after the small Russian town of Palekh, whose peculiar coat of arms depicts a firebird with a very distinctive long tail.

Yandex sorts long-tail keywords into several categories, from less to more specific. A query and its most relevant results won’t always have words in common, which makes the search engine’s job harder. For example:

  • Queries from people who don’t remember the name of a movie they’ve seen recently, but have one very particular scene etched in their mind: “film about a man who grows potatoes on another planet” > The Martian.
  • Queries from people, often kids, who don’t really understand how a search engine is meant to be used and speak to it as if it were a person: “yandex, please give me recommendations of cool tablet games with fairies” > Their search intent could probably be met by a page recommending fantasy games for iOS or Android.

This is where the algorithm needs to be taught to understand and respond to more natural, “human” queries.

Yandex provides the following two-dimensional graphic representation (for us mortals) to explain how Palekh works:

[Image: semantic vector example from Yandex]

Korolyov

Almost a year later, in August 2017, came the next big update to Yandex’s AI algorithm: Korolyov.

Korolyov builds on Palekh, but it’s even more powerful. While the previous update only looked at the title tag to match the user’s search term with results, Korolyov reads and analyses the entire content of a page, returning much more accurate results that respond to the user’s search intent. That’s not all: its capacity to process documents in real time is a thousand times greater. Moreover, being an AI-based system, its neural network keeps on learning from a thorough analysis of how users behave when presented with the results. For instance, it compares the current query with other queries that have previously taken users to the same content, and it takes into account the time a user spent on a page after landing there through a given query, among other relevance indicators.
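As a rough illustration of those two behavioural signals, the sketch below scores a page by how close the current query is to earlier queries that led users to it, giving more weight to visits where the user stayed longer. Everything here is an assumption made for the example: the function name, the toy 2-dimensional vectors, the 120-second cap and the way the two signals are multiplied together. Yandex has not published its formula in this form.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def behavioural_score(query_vector, past_visits, dwell_cap_seconds=120.0):
    """Score one page from its visit history (hypothetical scoring rule).

    past_visits is a list of (past_query_vector, dwell_seconds) pairs,
    one per earlier visit that reached this page from a search query.
    """
    best = 0.0
    for past_vector, dwell_seconds in past_visits:
        # Longer visits count for more, up to an arbitrary cap.
        dwell_weight = min(dwell_seconds, dwell_cap_seconds) / dwell_cap_seconds
        similarity = cosine_similarity(query_vector, past_vector)
        best = max(best, similarity * dwell_weight)
    return best

# Two earlier visits to the same page: (query vector, seconds spent on page).
visits = [([0.9, 0.1], 90.0), ([0.1, 0.9], 300.0)]

score_similar = behavioural_score([1.0, 0.0], visits)
score_other = behavioural_score([0.0, 1.0], visits)
print(round(score_similar, 3), round(score_other, 3))
```

In this toy history the second past query kept the user on the page for longer, so a new query that resembles it receives the higher score.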

In addition, the semantic vectors for documents are calculated at the indexing stage, which lets the search engine establish connections quickly and efficiently. This saves considerable resources: the algorithm only has to process a piece of content once, and can then compare each query vector with content vectors it already knows.
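The saving can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not Yandex’s architecture: `VectorIndex` is a hypothetical class, and `toy_embed` is a stand-in for the neural network that simply counts words from a tiny fixed vocabulary, so it does not capture meaning the way a real model would.

```python
import math

VOCABULARY = ["mars", "potato", "planet", "fairy", "game", "tablet"]

def toy_embed(text):
    """Placeholder for the neural network: counts vocabulary words in the text."""
    words = text.lower().split()
    return [float(words.count(term)) for term in VOCABULARY]

def cosine_similarity(a, b):
    """Return the cosine of the angle between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

class VectorIndex:
    """Stores one precomputed vector per document (hypothetical helper)."""

    def __init__(self, embed):
        self._embed = embed
        self._vectors = {}
        self.documents_embedded = 0

    def add_document(self, doc_id, text):
        # The costly step runs once per document, at indexing time.
        self._vectors[doc_id] = self._embed(text)
        self.documents_embedded += 1

    def search(self, query, limit=3):
        # At query time only the query itself is converted into a vector.
        query_vector = self._embed(query)
        scored = [
            (doc_id, cosine_similarity(query_vector, vector))
            for doc_id, vector in self._vectors.items()
        ]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:limit]

index = VectorIndex(toy_embed)
index.add_document("martian", "astronaut grows potato crops on planet mars")
index.add_document("fairies", "fairy game list for tablet owners")

results_1 = index.search("potato planet")
results_2 = index.search("fairy tablet game")
print(results_1[0][0], results_2[0][0], index.documents_embedded)
```

However many searches are run, `documents_embedded` stays at two: each document is processed once when it is indexed, and every later query is only compared against the stored vectors.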

In the same year Korolyov was rolled out, Yandex also launched its AI assistant, Alice. This release boosted the use of voice search in the search engine.

Andromeda

In 2018 came Andromeda. This latest update brought new improvements to the search engine, further developing and enriching the learning capabilities of its intelligent algorithm. It makes searching for information much more intuitive and easy for users, and makes the content in the results more relevant, more reliable and drawn from better-quality sources.

We also see the arrival of new features, like quick answers, which provide direct and clear results for simple queries. For example:

  • When is [holiday]?
  • Which football teams are playing today?
  • Cafés near me.

Another new feature is Yandex Experts, where users can put questions on a variety of topics to real experts if they don’t find an appropriate answer to their query in the search results.

Conclusions

What can we learn from the path Yandex has been taking over the last few years? How does it affect SEO in Russia? To put it briefly, we don’t see many differences between Google and Yandex in that respect.

  • Creating relevant, high-quality content remains vital for a website to prosper. When we create content for our website, we must focus on our users, not the search engine. What we write must be correct, coherent and valuable.
  • Websites must provide a better user experience and work quickly and efficiently on mobile devices. As we’ve seen in this post, mobile browsing is predominant among Russian-speaking Internet users, as it is in the rest of the world.
  • Voice search will continue to grow.

What new things do you think we can expect from Yandex this year?

Author: Anastasia
International SEO consultant and translator at the Human Level online marketing agency.
