Indexable and accessible AJAX

Written by Ramón Saquete


Accessible and indexable Ajax with HTML5
AJAX (Asynchronous Javascript and XML) is a combination of technologies that allows reloading parts of a page, thus avoiding having to reload the entire page. This technology can always be implemented in a way that makes it indexable and, with HTML5, we can also make it accessible.

To illustrate the explanation, let’s suppose we have a page with a main menu, a header, a footer and page contents that are loaded by AJAX, so that the header and footer do not have to be reloaded every time a different content is loaded. When reading these lines, many of you will remember that this was done in the past (at the end of the 90’s and the beginning of the 21st century) with frames, a technique that became obsolete because it caused many problems. Let’s see if it is possible, a decade later, to apply AJAX to this scenario without running into indexability and accessibility problems.

Indexable AJAX

For an AJAX link to be indexable we must always have two URLs: one that returns only the part of the page that has to be repainted for users, and another that returns the entire page for crawlers and for users who have Javascript disabled.

In the proposed example let’s suppose that in the menu we have an option to go to the Blog, so we will have two links:

https://www.humanlevel.com/blog.html → This URL will load the entire page with the entries and will be indexed.

https://www.humanlevel.com/ajax/blog.php → This URL will load only blog entries.

In the menu, the link will appear as follows:
<a href="https://www.humanlevel.com/blog.html" id="menuLinkBlog">Blog</a>

Since the link has an indexable URL associated with it, when the Google spider arrives it will have no problem following the link.
But what will happen when the user clicks? Well, as good developers, we will have implemented a Javascript function associated with the click event of the link with identifier “menuLinkBlog”. This function will load, via AJAX, the result of the URL https://www.humanlevel.com/ajax/blog.php (the list of entries) into the main content area of the page, and it will prevent the default action of the link from being executed, so the browser will not jump to https://www.humanlevel.com/blog.html.
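
As an illustration, a handler along these lines could do the job. It is only a minimal sketch: the container id “mainContent” is an assumption and is not defined in the article, so adapt it to your own markup.

document.getElementById('menuLinkBlog').addEventListener('click', function (event) {
    event.preventDefault(); // do not follow the link to blog.html

    var xhr = new XMLHttpRequest();
    xhr.open('GET', 'https://www.humanlevel.com/ajax/blog.php');
    xhr.onload = function () {
        if (xhr.status === 200) {
            // inject only the list of entries into the main content area (hypothetical id)
            document.getElementById('mainContent').innerHTML = xhr.responseText;
        }
    };
    xhr.send();
});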

By adding the following line to the robots.txt file, we prevent the URLs of the asynchronous requests in the /ajax directory from being indexed, in case the spider is able to execute Javascript:

Disallow: /ajax/

Update: this solution is no longer applicable; the best option now is to make the calls with the POST method to avoid indexing.
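
Following the previous sketch, the only change needed to apply this update would be the request method, something like:

var xhr = new XMLHttpRequest();
xhr.open('POST', 'https://www.humanlevel.com/ajax/blog.php'); // POST instead of GET, so the /ajax/ URL is not treated as indexable content
xhr.send();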

Accessible AJAX

In addition to making the URL https://www.humanlevel.com/blog.html indexable, we want it to be accessible so that it appears in the browser bar when clicking on the link, allowing users to share it, bookmark it and also browse the history by clicking back and forth.

HTML5 Javascript solution (pushState)

To make it possible to change the URL in the browser bar without navigating to another page, HTML5 introduces the Javascript pushState method:


history.pushState(null, "Blog de HumanLevel", "https://www.humanlevel.com/blog.html");

The first parameter is used to pass data associated with the new URL, the second changes the title of the page (something we could already do with document.title), and the last one indicates the new URL we are navigating to.

In addition, we will have to associate a function to the popstate event. This will be triggered when we move through the history and will load, by AJAX, the content corresponding to the friendly URL we are in, without reloading the whole page.
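
Putting both pieces together, a minimal sketch could look like this. The helper loadContent(url) is hypothetical (it would perform the AJAX request and inject the response into the main content area), as is the home URL used in the popstate handler.

// navigate to the blog without reloading the whole page
function goToBlog() {
    loadContent('https://www.humanlevel.com/ajax/blog.php');           // hypothetical AJAX helper
    history.pushState(null, "Blog de HumanLevel", "https://www.humanlevel.com/blog.html");
}

// fired when the user moves back and forward through the history
window.addEventListener('popstate', function () {
    if (location.pathname === '/blog.html') {
        loadContent('https://www.humanlevel.com/ajax/blog.php');
    } else {
        loadContent('https://www.humanlevel.com/ajax/home.php');       // hypothetical URL for the home content
    }
});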

This is a great solution as a WPO (Web Performance Optimization) technique and it can also save memory on the server, since we can keep, on the client, the state of the variables that do not affect security. The only disadvantages are the increased development time and the fact that it is still too early to use it because of the hated Internet Explorer, which does not implement pushState until version 10; and although version 11 will be released in a few months, versions 8 and 9 are so widespread right now that it is not viable to rely on this fantastic technique, which will revolutionize Web loading times.

Solution with fragments

What happens if the browser (Internet Explorer) does not support pushState? In that case, we can detect the situation from our code and, if the drawbacks of the fragment solution do not worry us too much, implement it.

When you type the hash character # at the end of a URL, followed by a word, you are adding a fragment to this URL (not to be confused with the Twitter hashtags).

Well, it turns out that, without using the new pushState method, we can modify the fragment of the URL by assigning location.hash = 'blog', and, just as with the popstate event, the hashchange event lets us detect when we move through the history.

In this way, using Javascript, when clicking on the link in the previous example, the URL will change to something like this:

https://www.humanlevel.com/#blog.html
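
A minimal sketch of this fallback, reusing the hypothetical loadContent(url) helper from the previous example, could be:

if (!history.pushState) {                       // e.g. Internet Explorer 8 and 9
    document.getElementById('menuLinkBlog').onclick = function () {
        location.hash = 'blog.html';            // changes the URL without reloading the page
        return false;                           // prevent the default navigation to blog.html
    };

    // fired when the fragment changes, including back and forward navigation
    window.onhashchange = function () {
        if (location.hash === '#blog.html') {
            loadContent('https://www.humanlevel.com/ajax/blog.php');
        }
    };
}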

The problem with this is that when someone shares our URL, they will not be passing popularity to the blog page, but to https://www.humanlevel.com/. This is because, when the request is made to the server, URLs with a fragment are sent without it. The URL of the page without the fragment will therefore be the one retrieved and indexed by the Google spider when it finds the link on another page.

However, users will be able to see the content of the blog since, through AJAX, once the page returned by the server has loaded, the content should change to the blog content. This considerably increases the loading time for users coming from external links with fragments. Also, in the rare case that the user has Javascript disabled, they will not see the same page.

Google’s bad solution with hashbang

A few years ago Google published a specification called Making AJAX Applications Crawlable, aimed at making applications programmed with AJAX indexable and accessible at the same time.

This specification did not convince many of us developers. Proof that it was not a good idea can be found in Twitter, which implemented it and later abandoned it, if I remember correctly, between 2010 and 2012.

The idea is to have links with URLs like the one in the following example:

<a href="https://www.humanlevel.com/#!blog.html">Blog</a>

In this case, when the Google spider finds the hashbang (#!), instead of sending the request to the server with the URL https://www.humanlevel.com/ as the standard dictates, it bypasses all the principles of good engineering and makes a request to the server like this one:

https://www.humanlevel.com/?_escaped_fragment_=blog.html

In the server code, we will have to program this request to return the complete page with the list of blog entries, as if it were the URL https://www.humanlevel.com/blog.html from the previous examples. Then, again with Javascript, we will handle the AJAX loading of the content in the browser; in this case we only have to implement the logic for the hashchange event.
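
As a rough idea of what the server side has to do, here is a minimal sketch written in Node.js purely for illustration (the site in the examples appears to use PHP, and the renderFullBlogPage and renderHomePage helpers are hypothetical):

var http = require('http');
var url = require('url');

http.createServer(function (request, response) {
    var query = url.parse(request.url, true).query;

    response.writeHead(200, { 'Content-Type': 'text/html' });

    if (query._escaped_fragment_ === 'blog.html') {
        // return the complete page with the list of entries,
        // just as https://www.humanlevel.com/blog.html would
        response.end(renderFullBlogPage());    // hypothetical helper
    } else {
        response.end(renderHomePage());        // hypothetical helper
    }
}).listen(80);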

The problems with this solution are:

  • Increases development time.
  • It is very easy to make a bad implementation that generates duplicate content.
  • This solution is valid only for Google and Bing. The latter has also adopted it but only by activating it from its webmaster tool.
  • Instead of the friendly URL https://www.humanlevel.com/blog.html, the search engine will index the URL https://www.humanlevel.com/#!blog.html. Although Google says that this aberration is a nice URL, for me it is quite debatable.
  • As in the previous case, the loading time increases for users who access the “nice” URL https://www.humanlevel.com/#!blog.html directly, since we must load the page and then, via Javascript, launch another request to the server to load the part that comes by AJAX.
  • You have to deal with encoding transformations that are made between the original fragment and the querystring that is sent in the transformed URL.
  • Users with Javascript disabled will not see the same page.

Accessibility for the disabled with WAI-ARIA

You can’t talk about web accessibility and Javascript without at least mentioning the WAI-ARIA standard (Web Accessibility Initiative – Accessible Rich Internet Applications). This standard is primarily intended to make Javascript-intensive web applications accessible to disabled users who can only use the keyboard or who have to use a screen reader such as JAWS or NVDA.

This standard adds a series of attributes to the HTML that define semantic information about the role of each element and its state. These attributes must be updated with Javascript while the application is being used, so that the operating system’s accessibility API can interpret them and pass the information to the user’s screen reader.
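
For example, in the AJAX scenario described above, we could mark the main content area (the hypothetical “mainContent” container from the earlier sketches) as a live region and update its state with Javascript while the content is being replaced:

var mainContent = document.getElementById('mainContent');   // hypothetical id

mainContent.setAttribute('role', 'main');         // landmark role for browsers without support for the main tag
mainContent.setAttribute('aria-live', 'polite');  // announce AJAX-injected content to the screen reader
mainContent.setAttribute('aria-busy', 'true');    // state: the region is being updated

// ... once the AJAX response has been injected ...
mainContent.setAttribute('aria-busy', 'false');   // state: the region is up to date again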

WAI-ARIA is part of the HTML5 specification and, according to the W3C, the semantic tags of HTML5 (and, within three years, of HTML5.1) should be used over WAI-ARIA attributes when there are equivalences. In practice, however, both are sometimes used together when a new HTML5 tag is not yet supported by all browsers, as happens, for example, with the new HTML5 main tag, which is usually accompanied by the WAI-ARIA role="main" attribute.

The Web is evolving for the better. The HTML5 pushState method and WAI-ARIA allow faster and more accessible pages to be implemented without affecting SEO. However, we will only be able to enjoy the new technologies when the old versions of Internet Explorer disappear.
