What JavaScript technologies are indexable with Googlebot evergreen?


Written by Ramón Saquete

In May 2019, Googlebot stopped using the rendering engine from Chrome 41 and started keeping it permanently updated to the latest version of Chromium; this is what is now called Googlebot evergreen. Although this change certainly improves the situation, there are still JavaScript technologies that Googlebot cannot run, and we must take them into account so that it indexes all of our content instead of an error message.

As you already know, Googlebot doesn’t always execute JavaScript, but when it does, there are technologies it cannot run. Let’s see what we need to keep in mind to deal with these situations.


What is the recommended way to implement JavaScript features?

In general, we should always detect which technologies the browser allows us to use and, when one of them is not supported, provide alternative content that is accessible to both the user and the bot. Imagine that instead of alternative content we display an error page: if Googlebot executes the JavaScript without that technology enabled, it will index the error message.
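For example, a minimal sketch of this kind of detection, assuming a hypothetical init3DViewer() function and a hypothetical fallback-content element that already contains indexable text and images:

  function supportsWebGL() {
    try {
      var canvas = document.createElement('canvas');
      return !!(canvas.getContext('webgl') || canvas.getContext('experimental-webgl'));
    } catch (e) {
      return false;
    }
  }

  if (supportsWebGL()) {
    init3DViewer(); // hypothetical function that draws the 3D view
  } else {
    // hypothetical element with indexable text and images instead of an error
    document.getElementById('fallback-content').hidden = false;
  }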

If we use more advanced features, we should always test them by disabling them in our browser (on Chrome this can be done from chrome://flags) to see how the page behaves without them.

In terms of implementation, developers can follow one of these two strategies:

  • Progressive enhancement: start from basic content with text and images and progressively add more advanced features (see the sketch after this list).
  • Graceful degradation: start from content with advanced features and fall back to progressively more basic features, down to the most basic content.
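A minimal progressive-enhancement sketch, assuming a hypothetical server-rendered product-gallery element and a hypothetical initCarousel() enhancement:

  // Progressive enhancement: the server already renders a plain, indexable
  // gallery, and JavaScript only upgrades it when the required API exists.
  var gallery = document.getElementById('product-gallery'); // hypothetical server-rendered list of images and text
  if (gallery && 'customElements' in window) {
    // initCarousel is a hypothetical enhancement (e.g. a web component);
    // without it the basic gallery remains usable and indexable.
    initCarousel(gallery);
  }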

Google recommends progressive enhancement, because this way we’re less likely to leave any situation uncovered; this is where the Progressive Web Applications (PWA) name comes from. It is also recommended to implement the onerror event on our website to send JavaScript errors to the server through AJAX. This way, they will be stored in a log, which will allow us to see whether there’s an issue with page rendering.
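A minimal sketch of such a handler, assuming a hypothetical /js-error-log endpoint that writes the received errors to a server-side log:

  window.onerror = function (message, source, line, column, error) {
    var payload = JSON.stringify({
      message: message,
      source: source,
      line: line,
      column: column,
      stack: error && error.stack,
      page: window.location.href
    });
    // '/js-error-log' is a hypothetical endpoint that appends the error to a log.
    if (navigator.sendBeacon) {
      navigator.sendBeacon('/js-error-log', payload);
    } else {
      var xhr = new XMLHttpRequest();
      xhr.open('POST', '/js-error-log', true);
      xhr.setRequestHeader('Content-Type', 'application/json');
      xhr.send(payload);
    }
    return false; // keep the browser's default error handling as well
  };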

How do we test Googlebot’s behaviour?

Google updated the version of Googlebot not only for its crawler, but also for all its tools: URL Inspection in Google Search Console, the Mobile-Friendly Test, the Rich Results Test and the AMP Test.

This means that if we want to see how a page is rendered with the latest Googlebot rendering engine or Web Rendering Service (WRS), as well as its JavaScript errors, we can either go to Google Search Console, use “URL Inspection” and, after the analysis, click on “Test live URL”; or we can use the Mobile-Friendly Test tool.

Are there any technologies that will stop working with each new Googlebot update?

Sometimes Google Chrome removes features in its updates (you can stay up to date with them here and here). These are usually experimental features that are disabled by default. For example, the WebVR API for virtual reality became obsolete in version 79, giving way to the WebXR API for augmented and virtual reality.

But what happens when the feature isn’t that experimental? This was the case of Web Components v0: when Google announced it was going to remove them in version 73 (Chrome was the only browser supporting them, while Web Components v1 are supported by all browsers), early adopters asked Google for more time to update their code, which delayed their complete removal until February 2020. Nevertheless, it is uncommon for a website to stop rendering correctly as a result of a Googlebot update. Google engineers always try to maintain backward compatibility in their APIs, so we usually needn’t worry about this. In any case, if our website implements the onerror event, as mentioned earlier, we will be able to see in our logs whether there is an error.

If we want to be extremely cautious, we can locate the URLs of each template used on our website and test them all with each Googlebot update, using the Mobile-Friendly Test. In any case, it’s good practice to always test the rendering of all templates with each update of our website’s code.

How does Googlebot evergreen affect polyfills and transpiled code?

When a JavaScript feature doesn’t work on all browsers, developers use polyfills: additional JavaScript code (often quite a lot of it) that implements the missing feature in browsers that lack it.

With the latest update of the Googlebot engine, it is no longer necessary to load polyfills for the bot to see the page correctly, but we should still do so if we want users with incompatible browsers to see it correctly too. If we only want to support modern browsers, with Googlebot evergreen we can use fewer polyfills and the website will load faster.
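For instance, a hedged sketch of loading a polyfill only when the browser actually needs it (the polyfill path is a hypothetical, self-hosted file):

  // Load the IntersectionObserver polyfill only in browsers that lack the API;
  // Googlebot evergreen supports it natively, so it never downloads this file.
  if (!('IntersectionObserver' in window)) {
    var script = document.createElement('script');
    script.src = '/js/intersection-observer-polyfill.js'; // hypothetical path
    document.head.appendChild(script);
  }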

Similarly, it may not be necessary to transpile (convert JavaScript code to an older version of the language) to support Googlebot, unless we’re using a very recent version of JavaScript or an extended one like TypeScript. Again, it will still be necessary if we want to support all browsers, old and new.
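One common way to cover both cases at once is differential serving: modern browsers, Googlebot evergreen included, get the untranspiled bundle, while older ones get the transpiled version. A minimal sketch, assuming hypothetical app.modern.js and app.legacy.js bundles (the same idea can also be expressed declaratively with script type="module" and nomodule attributes):

  // Browsers that understand ES modules (Googlebot evergreen among them) get
  // the untranspiled bundle; older browsers get the transpiled ES5 version.
  var script = document.createElement('script');
  if ('noModule' in HTMLScriptElement.prototype) {
    script.src = '/js/app.modern.js'; // hypothetical bundle written in modern JavaScript
  } else {
    script.src = '/js/app.legacy.js'; // hypothetical bundle transpiled to ES5
  }
  document.head.appendChild(script);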

To which JavaScript technologies should we pay special attention?

With the switch from Chrome 41 to Googlebot evergreen, technologies that used to cause rendering errors now work correctly:

  • Web Components v1.
  • The CSS Font Loading API, to control how fonts are loaded.
  • WebXR, for virtual and augmented reality.
  • WebGL, for 3D graphics.
  • WebAssembly, to run code almost as fast as a native application.
  • The Intersection Observer API and the loading="lazy" attribute, to lazy-load images.
  • New features added to the ECMAScript 6 JavaScript standard.

Page with WebGL as rendered by Googlebot
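As an illustration of the last two items in the list above, here’s a minimal lazy-loading sketch that relies on loading="lazy" where available and falls back to the Intersection Observer API elsewhere (the img[data-src] markup is an assumption):

  var images = Array.prototype.slice.call(document.querySelectorAll('img[data-src]'));
  if ('loading' in HTMLImageElement.prototype) {
    // Native lazy loading, supported by Googlebot evergreen.
    images.forEach(function (img) {
      img.loading = 'lazy';
      img.src = img.dataset.src;
    });
  } else if ('IntersectionObserver' in window) {
    // Fallback: only load an image when it approaches the viewport.
    var observer = new IntersectionObserver(function (entries) {
      entries.forEach(function (entry) {
        if (entry.isIntersecting) {
          entry.target.src = entry.target.dataset.src;
          observer.unobserve(entry.target);
        }
      });
    });
    images.forEach(function (img) { observer.observe(img); });
  } else {
    // Very old browsers: load everything immediately so no image is lost.
    images.forEach(function (img) { img.src = img.dataset.src; });
  }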

Be careful with some of these technologies: whenever they’re used, we should verify that Googlebot renders them, in case there is still some feature the bot hasn’t implemented. Moreover, even if the page is rendered, it doesn’t mean Google will be able to index content displayed inside a 3D or augmented reality scene; it simply means Googlebot won’t block its rendering or return an error.

There are technologies for which Googlebot doesn’t return an error, but which it still doesn’t use, for an obvious reason: Googlebot is not a user, it’s a crawler. These technologies are:

  • Technologies requiring user permissions: for example, displaying content that depends on the user’s location through the Navigator.geolocation API. If we show an error message when the permission is denied, that error will get indexed, because Googlebot denies all permission requests by default; our best bet is to display a warning together with generic content (a minimal sketch follows this section).
  • Service Workers: they are installed in the browser on the first request to provide PWA features, such as caching pages for offline browsing or push notifications. These features make no sense for a crawler, so Googlebot simply ignores them.
  • WebRTC: it wouldn’t make any sense for the bot to index content from a technology used for P2P communication between two browsers. It is typically used by web applications like Skype or Hangouts once the user has signed in, and Googlebot neither signs in nor holds video conferences.
  • WebSockets: they are used to push content updates from the server without the browser requesting them, which allows us to implement chats or content updates in parallel to the user’s browsing; they can also be used as a replacement for AJAX. Googlebot doesn’t allow connections to WebSockets, so if the main content of the website is loaded through them, it won’t be indexed even though Googlebot is able to run JavaScript. To find out whether WebSockets are being used on a website, we can go to the WS tab of Chrome DevTools:
Google Chrome Dev Tools: websocket

Again, this doesn’t mean that Googlebot doesn’t support this technology; it simply doesn’t use it, because it doesn’t allow the connections.
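Going back to the permissions example from the list above, a minimal sketch, assuming hypothetical showLocalContent() and showGenericContent() helpers:

  navigator.geolocation.getCurrentPosition(
    function (position) {
      // showLocalContent is a hypothetical helper that swaps in content for the user's area.
      showLocalContent(position.coords.latitude, position.coords.longitude);
    },
    function () {
      // Googlebot always denies the permission request, so this is the branch it renders
      // and indexes: show a notice plus generic, indexable content rather than an error page.
      showGenericContent();
    }
  );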


Is AJAX allowed?

When Googlebot runs JavaScript, it has no problem executing the Fetch API or XMLHttpRequest used to implement AJAX. Be careful, though: each AJAX request counts against the crawl budget (if we have too many, the indexing of our website will take a hit).
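A hedged sketch of such a request with the Fetch API (the endpoint, container and payload shape are assumptions):

  // Each request like this one consumes crawl budget when Googlebot renders the page.
  fetch('/api/related-products?id=123') // hypothetical endpoint returning a JSON payload
    .then(function (response) { return response.json(); })
    .then(function (data) {
      // hypothetical container and payload shape
      document.getElementById('related-products').innerHTML = data.html;
    })
    .catch(function () {
      // On failure, the server-rendered fallback content simply stays in place.
    });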

On the other hand, if we retrieve content through AJAX during the initial load, it won’t be indexed if Google doesn’t have rendering budget to run JavaScript on that page.


How does Google behave with session variables and client data?

To maintain state between requests on a website (maintaining state means knowing which actions a user performed previously, such as signing in or adding products to a shopping basket), we can use a cookie, called a session cookie, to store an ID. This value identifies a memory slot on the server where session variables can be stored.

However, using the SessionStorage, LocalStorage and IndexedDB APIs we can store information in the browser to maintain state, and in the case of SPAs (Single Page Applications) we can keep the state in the client’s memory simply by using JavaScript variables.

Just as Googlebot behaves with cookies as if each request came from a new user without any cookies, with SessionStorage and the other technologies it behaves in exactly the same way.

Although Googlebot supports SessionStorage, LocalStorage and IndexedDB, and using these technologies in our development won’t cause an error, on each request its storage is empty, as if it were a new user. This means it won’t index anything state-dependent, and each URL will always carry the same content. Similarly, the bot doesn’t browse a SPA by clicking its links: it loads each page completely from scratch, initialising all JavaScript code and its variables with each request.
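A minimal sketch of state handling that keeps the page indexable, assuming a hypothetical 'basket' key and a hypothetical renderBasket() function:

  // Googlebot starts every request with empty storage, so the page must render
  // meaningful default content when nothing has been stored yet.
  var stored = null;
  try {
    stored = window.localStorage.getItem('basket'); // hypothetical key
  } catch (e) {
    // Storage may be unavailable or blocked; fall through to the default below.
  }
  var basket = stored ? JSON.parse(stored) : { items: [] };
  renderBasket(basket); // hypothetical function: with an empty basket it should still output indexable content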

Conclusions

It’s always best when all the content of a website is indexable without using JavaScript. However, if this technology is used for features that provide added value to the user, we must be careful with how they are implemented, to prevent rendering errors and the indexing of error messages.

Author: Ramón Saquete
Web developer at Human Level Communications online marketing agency. He's an expert in WPO, PHP development and MySQL databases.
