Sitemap

Written by Sandra López

What is a sitemap?

A sitemap is a file that lists the pages that make up a website. Sitemap files make it easier for search engine robots to discover and crawl pages. They are also useful for comparing the number of pages indexed with the number of pages submitted in the sitemap, as reported in Google Search Console. The URLs included in a sitemap must correspond to current pages that we consider important to index, which is why the file must be kept up to date.

The search engines that support the Sitemaps protocol include Google, which pioneered it with Google Sitemaps, as well as Yahoo!, Bing, Ask.com and MSN.

The format to use when creating this file is XML, although plain text files and RSS web feeds are also supported and read by search engines. The file lists the URLs of the website that we want indexed and is usually accompanied by metadata that tells search engines the date of the last update, the relative priority of each URL and how often it changes.

Advantages of the sitemap

Sitemaps help search engines crawl our website efficiently. Thanks to the hierarchy they describe, crawlers can reach deeper levels and thus index the content we want to have in Google.

They therefore improve indexing, so that your content is available to search engines, which can show it to users whenever they consider it relevant to the searches performed.

Normally, search engine bots can crawl a well-linked, well-structured and hierarchically sound website without any problem. A sitemap becomes essential, however, in three cases: on a very large website, where bots may not reach very deep or poorly linked pages; on new sites that have few external links and that we want indexed as soon as possible; and on news websites or sites with other interactive media content, so that Google can categorize and manage that content optimally.

They also make the crawlers' work easier by using metadata to indicate the type of content found at each URL or group of URLs, since the sitemap can specify the kind of files involved: videos, images or mobile pages.

How to create a sitemap

XML sitemap format with UTF-8 encoding

The first thing we recommend is to use the XML format with UTF-8 encoding, which is declared in the first line of the file served at the sitemap's URL. The example shows the format and encoding declaration we recommend, as well as a sample of the URL list it contains.

Sitemap file source code

Through additional tags, we can give Google extra data about each entry, such as the time of the last update (<lastmod>), the expected change frequency (<changefreq>) and the relative priority (<priority>), alongside the URL in question (<loc>).

The image shows how the sitemap can be generated with a CMS plugin, that is, a small program for the content manager used to build the website that produces the sitemap automatically. It is advisable to check that it works correctly rather than rely entirely on the automation.

The URL of the sitemap will be similar to the one shown below.

Example sitemap.xml
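As a complement to a CMS plugin, the tags described in this section can be assembled with a few lines of Python. The following is a minimal sketch using only the standard library: the function name `build_sitemap`, the shape of the `pages` input and the example.com URLs are assumptions made for illustration, not part of any plugin.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the Sitemaps protocol (sitemaps.org).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
ET.register_namespace("", SITEMAP_NS)


def build_sitemap(pages):
    """Return a sitemap document as UTF-8 encoded bytes.

    `pages` is a list of dicts with a mandatory "loc" key and optional
    "lastmod", "changefreq" and "priority" keys (an assumed input shape).
    """
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for page in pages:
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        for tag in ("loc", "lastmod", "changefreq", "priority"):
            if tag in page:
                ET.SubElement(url, f"{{{SITEMAP_NS}}}{tag}").text = str(page[tag])
    # xml_declaration=True writes the XML declaration with the encoding
    # as the first line of the output (requires Python 3.8 or later).
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)


# Hypothetical pages used only to demonstrate the call.
example_pages = [
    {"loc": "https://www.example.com/", "lastmod": "2024-01-15",
     "changefreq": "weekly", "priority": 1.0},
    {"loc": "https://www.example.com/blog/"},
]
sitemap_bytes = build_sitemap(example_pages)
```

Writing `sitemap_bytes` to a file named sitemap.xml would give a document with one <url> entry per page; only <loc> is required, so the second page omits the optional tags.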

Other formats we can use

In addition to the format we have recommended above, you can create your sitemap with other supported formats:

  • RSS, mRSS and Atom 1.0: if you have a blog with an RSS or Atom feed, it is advisable to submit the URL of the feed as a sitemap.
  • Text: a .txt file containing the URLs, one per line. Do not forget to encode the file as UTF-8.
  • Google Sites: if you build the website with it, the sitemap to send to Google is generated automatically.
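For the plain-text option, a short sketch may help. It writes one URL per line and sets the encoding explicitly; the function name `write_text_sitemap` and the file name used in the comment are illustrative assumptions.

```python
from pathlib import Path


def write_text_sitemap(urls, path):
    """Write one URL per line to `path` as UTF-8 and return the URL count."""
    # Drop surrounding whitespace and skip empty entries.
    lines = [url.strip() for url in urls if url.strip()]
    Path(path).write_text("\n".join(lines) + "\n", encoding="utf-8")
    return len(lines)


# Example call (hypothetical file name and URLs):
# write_text_sitemap(["https://www.example.com/", "https://www.example.com/blog/"], "sitemap.txt")
```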

Other Sitemap types

If your website regularly hosts different kinds of multimedia content, you should use the extensions mentioned above to tell the search engine that these groups of multimedia content exist:

  • Video sitemaps inform the search engine of the video content we have so that it can categorize it as such. The videos can then be found in Google's video search results, which show each result along with a thumbnail that we provide or that is generated automatically.

xmlns:video="http://www.example.com/schemas/sitemap-video/1.1"

  • Image sitemaps tell the search engine about the image content of the website so that it can appear in Google's image search results. The sitemap helps with images that the search engine could not identify on its own, making them easier to index. You can include up to 1,000 images per page, that is, per <url> entry.

xmlns:image="http://www.example.com/schemas/sitemap-image/1.1"

  • News sitemaps, for Google News, contain only the URLs of articles published in the last two days. They report news content that will appear as such in search results and will remain in the Google News index for 30 days. A news sitemap can contain up to 1,000 URLs. Do not create a new sitemap each time you publish news; update the one you have already uploaded. It is very important to have the site registered in the Google News Publisher Center.

xmlns:news="http://www.example.com/schemas/sitemap-news/0.7"

  • Sitemaps for low-end phones, which you should create only if you have a version of the page for these devices. Each URL must include the <mobile:mobile/> tag to ensure indexing.

xmlns:mobile="http://www.example.com/schemas/sitemap-mobile/1.0"
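The rules given for news sitemaps (only articles from the last two days, at most 1,000 URLs) lend themselves to a small selection step before the file is regenerated. The sketch below covers only that selection, not the XML output. The input shape, a list of (url, published_at) tuples, and the function name `select_news_urls` are assumptions for illustration.

```python
from datetime import datetime, timedelta, timezone

MAX_NEWS_URLS = 1000          # cap stated in the text
MAX_AGE = timedelta(days=2)   # "published in the last two days"


def select_news_urls(articles, now=None):
    """Return the URLs of articles published within MAX_AGE, newest first.

    `articles` is a list of (url, published_at) tuples where `published_at`
    is a timezone-aware datetime (an assumed input shape).
    """
    now = now or datetime.now(timezone.utc)
    # Keep articles that are not in the future and not older than MAX_AGE.
    recent = [(url, ts) for url, ts in articles
              if timedelta(0) <= now - ts <= MAX_AGE]
    recent.sort(key=lambda item: item[1], reverse=True)
    return [url for url, _ in recent[:MAX_NEWS_URLS]]
```

Running this each time the sitemap is rebuilt keeps a single file up to date instead of producing a new one per article, which matches the advice above.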

Best practices in sitemap

Bear in mind that the sitemap must contain all the URLs that we want indexed and that return a 200 server response code. Submit it in one of the formats indicated above and include only the site's own URLs, correctly formatted and free of session identifiers.
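These criteria can be checked before a URL is written to the sitemap. The following sketch assumes the HTTP status code has already been obtained elsewhere (for example from a crawl), and the set of session parameter names is hypothetical; adjust it to whatever the site actually uses.

```python
from urllib.parse import parse_qsl, urlsplit

# Hypothetical session parameter names, listed for illustration only.
SESSION_PARAMS = {"sid", "sessionid", "phpsessid", "jsessionid"}


def is_sitemap_candidate(url, site_host, status_code):
    """True if `url` belongs to `site_host`, returned 200 and has no session ID."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or parts.hostname != site_host:
        return False
    if status_code != 200:
        return False
    param_names = {name.lower()
                   for name, _ in parse_qsl(parts.query, keep_blank_values=True)}
    # Identifiers written as ";jsessionid=..." live in the path, not the query.
    has_path_session = ";jsessionid=" in parts.path.lower()
    return not (param_names & SESSION_PARAMS) and not has_path_session
```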

If your site is multilingual, include the canonical URLs for each language and indicate to Google the corresponding URLs in the other languages with hreflang annotations.
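One way to express these language alternates inside the sitemap itself is with xhtml:link elements carrying rel="alternate" and an hreflang attribute. The sketch below assumes that approach; the function name `build_multilingual_sitemap` and the input shape (one dict per page, mapping language code to URL) are illustrative.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
XHTML_NS = "http://www.w3.org/1999/xhtml"
ET.register_namespace("", SITEMAP_NS)
ET.register_namespace("xhtml", XHTML_NS)


def build_multilingual_sitemap(groups):
    """Return sitemap XML where each URL lists all its language versions.

    `groups` is a list of dicts mapping a language code to the URL of that
    language version of one page (an assumed input shape).
    """
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for alternates in groups:
        for own_url in alternates.values():
            url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
            ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = own_url
            # Every language version lists all versions, itself included.
            for lang, href in alternates.items():
                ET.SubElement(url, f"{{{XHTML_NS}}}link",
                              {"rel": "alternate", "hreflang": lang, "href": href})
    return ET.tostring(urlset, encoding="unicode")
```

Each <url> entry repeats the full set of alternates so that the annotations are reciprocal between language versions.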

Remember that each sitemap can contain at most 50,000 URLs (except for the sitemap types mentioned in the previous section) and must not exceed 50 MB uncompressed. If you need several sitemaps for a single website, it is advisable to create and submit to Google a sitemap index file that lists all of them.
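Splitting a long URL list and producing the index file can be sketched as follows. The code enforces only the 50,000-URL limit and does not check file size; the function names and the sitemap file names in the usage note are assumptions.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_SITEMAP = 50_000
ET.register_namespace("", SITEMAP_NS)


def chunk_urls(urls, size=MAX_URLS_PER_SITEMAP):
    """Split `urls` into consecutive lists of at most `size` items."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def build_sitemap_index(sitemap_urls):
    """Return a sitemap index document that lists each child sitemap URL."""
    index = ET.Element(f"{{{SITEMAP_NS}}}sitemapindex")
    for sitemap_url in sitemap_urls:
        entry = ET.SubElement(index, f"{{{SITEMAP_NS}}}sitemap")
        ET.SubElement(entry, f"{{{SITEMAP_NS}}}loc").text = sitemap_url
    return ET.tostring(index, encoding="unicode")
```

For instance, a site with 120,000 URLs would be split by `chunk_urls` into three lists, each written to its own file (say sitemap-1.xml to sitemap-3.xml, hypothetical names), and the three resulting URLs would be passed to `build_sitemap_index`.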

Tell Google where the sitemap is through the site's robots.txt file by specifying the sitemap URL in its content.

Sitemap in robots.txt
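In robots.txt the sitemap is declared with a line of the form "Sitemap: <URL>". As a hedged illustration, the sketch below appends that line to existing robots.txt content and reads it back with the standard library parser. Both function names are assumptions, and `site_maps()` requires Python 3.8 or later.

```python
from urllib.robotparser import RobotFileParser


def declared_sitemaps(robots_txt):
    """Return the sitemap URLs declared in robots.txt content."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap line is present.
    return parser.site_maps() or []


def add_sitemap_directive(robots_txt, sitemap_url):
    """Append a `Sitemap:` line unless that URL is already declared."""
    if sitemap_url in declared_sitemaps(robots_txt):
        return robots_txt
    lines = robots_txt.splitlines()
    return "\n".join(lines + ["", f"Sitemap: {sitemap_url}"]) + "\n"
```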

You can also do this with Google Search Console, where you can both submit sitemaps and check them. On the platform you can also check sitemap saturation, i.e. the proportion of the pages we submit to Google that are finally indexed. The goal for indexability is for sitemap saturation to tend toward 100%, with every submitted page indexed.

Sitemaps in Google Search Console
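The saturation figure is simple arithmetic: indexed pages divided by submitted pages, expressed as a percentage. A minimal sketch, with a hypothetical function name and with the two counts assumed to be read manually from the Search Console report:

```python
def sitemap_saturation(submitted, indexed):
    """Return the percentage of submitted pages that are indexed."""
    if submitted <= 0:
        raise ValueError("submitted must be a positive number of pages")
    return 100.0 * indexed / submitted
```

With 200 pages submitted and 150 indexed, the function returns 75.0, meaning a quarter of the submitted pages are still missing from the index.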

When hosting the sitemap, place it in the root of the server under the name sitemap.xml and update it regularly, so that the content we ask the search engine to index stays consistent with the content that actually exists on our website.

Sources

Managing your Google sitemap

Sandra López
Former Senior SEO consultant at Human Level. Graduated in Advertising and Public Relations. She has a Master's Degree in Marketing and Consumer Behavior, a Master's Degree in Professional SEO/SEM and a Master's Degree in Technical SEO. She also completed an advanced course in Web Design and Development. Specialist in media SEO.
