Written by Ramón Saquete
The existence of schema.org structured data types that are equivalent, or almost equivalent, to HTML5 semantic tags makes us wonder whether this information matters to Googlebot (remember that Google is one of the companies behind Schema.org).
We do not know for certain whether Google takes this type of structured data and HTML5 semantic tags into account. If it does, they can only have a positive effect on rankings, because they let the crawler know how pages are organised and answer several questions it has to resolve in order to index the content correctly.
Questions like:
- How do I separate the header and the footer from the main content?
- Is there any content that isn’t part of the main content and is only tangentially related to it?
- Which groups of links are used to navigate between pages, and which are simply a list of links on a specific page?
Ultimately, this tagging helps identify the main content of our pages, which is the content that must rank and stand out above everything else on each page.
Tagging this information with HTML5 semantic tags and, at the same time, with their equivalent structured data may seem redundant, but given that we don’t know how much importance Google gives to each of them, we recommend implementing both.
The <header> and <footer> tags mark the header and the footer of their nearest sectioning root or sectioning content element in HTML5. The most important ones are those that are direct children of the sectioning root <body>, with no intermediate sectioning content elements (<nav>, <article>, <section>, and <aside>) in between. These are the ones that represent the main header and footer, repeated sitewide.
Interestingly enough, the structured data types https://schema.org/WPHeader and http://schema.org/WPFooter represent the website’s header and footer in the same way. It wouldn’t be correct to use them on <header> and <footer> tags nested inside a sectioning content element, because these types are equivalent only to the main <header> and <footer> of the page.
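As a rough sketch of this distinction, the fragment below marks only the site-wide header and footer, and leaves the ones nested in an <article> untouched. It uses the microdata syntax for brevity; the ids, texts and link are hypothetical placeholders, not taken from any real site.

```html
<body>
  <!-- Site-wide header: direct child of <body>, so WPHeader fits here -->
  <header id="site-header" itemscope itemtype="https://schema.org/WPHeader">
    <a href="/">Example site</a>
  </header>

  <main>
    <article>
      <!-- Header of this article only: nested in sectioning content, no WPHeader -->
      <header>
        <h1>Article title</h1>
      </header>
      <p>Article text</p>
      <!-- Footer of this article only: no WPFooter -->
      <footer>Published by the author</footer>
    </article>
  </main>

  <!-- Site-wide footer: again a direct child of <body> -->
  <footer id="site-footer" itemscope itemtype="https://schema.org/WPFooter">
    <p>Legal notice</p>
  </footer>
</body>
```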
Similarly, we can use the https://schema.org/SiteNavigationElement structured data type for all <nav> elements, but the only important one is the one containing the main menu of the page, although we could also tag the links inside the footer. For breadcrumbs it is best to use the https://schema.org/BreadcrumbList type. Pagination is currently detected by Google without us even specifying which pages are the next and the previous ones, so pagination links don’t need this markup.
For the <aside> tag, used on any type of content surrounding the main content (sidebar or not), we have https://schema.org/WPSideBar. On the other hand, if the element only contains ads, the correct thing would be to use https://schema.org/WPAdBlock instead.
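A minimal sketch of both cases could look like the following: the main menu is tagged as a SiteNavigationElement with microdata, and the breadcrumb trail is described with a BreadcrumbList in JSON-LD. The www.example.com URLs, the id and the link names are assumptions made up for illustration.

```html
<!-- Main menu of the page -->
<nav id="main-menu" itemscope itemtype="https://schema.org/SiteNavigationElement">
  <a href="/services/">Services</a>
  <a href="/blog/">Blog</a>
</nav>

<!-- Breadcrumb trail: each step is a ListItem with its position in the trail -->
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "BreadcrumbList",
  "itemListElement": [
    {
      "@type": "ListItem",
      "position": 1,
      "name": "Blog",
      "item": "https://www.example.com/blog/"
    },
    {
      "@type": "ListItem",
      "position": 2,
      "name": "Structured data",
      "item": "https://www.example.com/blog/structured-data/"
    }
  ]
}
</script>
```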
For <body> we have the http://schema.org/WebPage type. Tagging the body of the document doesn’t seem very useful, but there are structured data types derived from it that can be used to specify what kind of page it is. For example, a FAQ page (https://schema.org/FAQPage), a contact page (https://schema.org/ContactPage), an About page (https://schema.org/AboutPage), and so on.
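For instance, a contact page could declare its more specific type with a short JSON-LD block like this one. It is only a sketch: the name and URL are hypothetical placeholders.

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "ContactPage",
  "name": "Contact us",
  "url": "https://www.example.com/contact/"
}
</script>
```

Because ContactPage derives from WebPage, it can carry the same properties as WebPage.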
But the most important thing is that the type https://schema.org/WebPage has several properties imitating the meaning of the <main> tag, useful for tagging the main content of the page.
The structured data equivalent to the <main> tag are:
- The mainContentOfPage property of https://schema.org/WebPage, whose value should be of type https://schema.org/WebPageElement, a wildcard structured data type applicable to any tag.
- The https://schema.org/WebPage type also has the mainEntity property, to tag the type of structured data representing the main content of the page (in cases for which said content uses some specific type of structured data).
- Alternatively, and in an equivalent manner to mainEntity and https://schema.org/WebPage, inside the main structured data type of the page we can use the mainEntityOfPage property, which will take the current URL as its value. Google recommends using this property with the https://schema.org/Article type, when it’s the main content of the page.
Later in this post we will see each of these points with an example.
Finally, the old <table> element also has its equivalent structured data type, which is https://schema.org/Table.
General recommendations for implementing these types of structured data
JSON-LD
Let’s look at a usage example of this type of structured data with JSON-LD (JSON for Linking Data), which is the format Google recommends. Imagine we have several ad blocks on the same topic scattered throughout the page. In this case, we can use the same class for all of them and associate it with the WPAdBlock type, using the following JSON-LD code:
… Page content …
<aside class="ad">Ad 1</aside>
Page content …
<aside class="ad">Ad 2</aside>
…
<script type="application/ld+json">
{
  "@context": "http://schema.org/",
  "@type": "WPAdBlock",
  "cssSelector": ".ad"
}
</script>
With the cssSelector property we specify the CSS selector that assigns the current type of data to the selected element (it’s also possible to specify an XPath expression with the xpath property).
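The xpath alternative mentioned above could be sketched as follows, reusing the same ad block. The XPath expression is our own assumption about how one might select the element; note that it only matches elements whose class attribute is exactly "ad".

```html
<aside class="ad">Ad 1</aside>

<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "WPAdBlock",
  "xpath": "//aside[@class='ad']"
}
</script>
```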
If we wanted to set two semantically independent blocks, for example, sidebars with two different intents, we would tag them separately:
<body>
  <main>
    <aside id="interiorSidebar">Main content sidebar</aside>
  </main>
  <aside id="exteriorSidebar">Website sidebar</aside>
  ….
  <!-- One JSON-LD block per sidebar, each pointing at its own id -->
  <script type="application/ld+json">
  {
    "@context": "http://schema.org/",
    "@type": "WPSideBar",
    "cssSelector": "#interiorSidebar"
  }
  </script>
  <script type="application/ld+json">
  {
    "@context": "http://schema.org/",
    "@type": "WPSideBar",
    "cssSelector": "#exteriorSidebar"
  }
  </script>
We can apply this implementation to all types deriving from WebPageElement (they all inherit the cssSelector and xpath properties): SiteNavigationElement, Table, WPAdBlock, WPFooter, WPHeader and WPSideBar. We should take care that WPHeader and WPFooter are each associated with exactly one element.
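One possible way to keep WPHeader and WPFooter tied to a single element each is to use a child selector, so that headers and footers nested inside articles are not matched. This is a sketch under the assumption that the page has exactly one <header> and one <footer> as direct children of <body>; the texts are placeholders.

```html
<body>
  <header>Site header</header>
  <main>
    <article>
      <!-- Not matched: this header is not a direct child of <body> -->
      <header>Article header</header>
      <p>Article text</p>
    </article>
  </main>
  <footer>Site footer</footer>

  <!-- "body > header" selects only the site-wide header -->
  <script type="application/ld+json">
  {
    "@context": "https://schema.org",
    "@type": "WPHeader",
    "cssSelector": "body > header"
  }
  </script>
  <!-- "body > footer" selects only the site-wide footer -->
  <script type="application/ld+json">
  {
    "@context": "https://schema.org",
    "@type": "WPFooter",
    "cssSelector": "body > footer"
  }
  </script>
</body>
```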
Microdata
We can also implement this type of data with microdata, in the following way:
<header itemscope itemtype="https://schema.org/WPHeader" id="header"> ... </header>
This implementation is shorter when the data type appears only once, as with the header and the footer. It is also the way WordPress implements this structured data. However, Google’s Structured Data Testing Tool doesn’t detect this type of implementation unless the tagged element carries an identifier, as in the example (id="header"). Moreover, with this implementation the tool doesn’t assign any value to the item, while with JSON-LD it takes the content of the selected element as its value, so the JSON-LD implementation appears to be more reliable.
Implementation of the main content (<main> tag) with structured data
JSON-LD
Let’s see an example of how to use the mainEntityOfPage property:
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Article",
  "mainEntityOfPage": "https://www.humanlevel.com/articulos/indexabilidad/datos-estructurados-equivalentes-a-etiquetas-semanticas-de-html5.html",
  "author": "...",
  ...
}
</script>
If we encapsulate this structured data type inside WebPage, we could express exactly the same thing with mainEntity in the following way:
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "WebPage",
  "mainEntity": {
    "@type": "Article",
    "author": "...",
    ...
  }
}
</script>
If we don’t have any structured data type that we can associate to the main content of the page, we can use WebPage and mainContentOfPage to implement it thus:
<main id="principal">
  Lorem ipsum
</main>
<script type='application/ld+json'>
{
  "@context": "https://schema.org",
  "@type": "WebPage",
  "mainContentOfPage": {
    "@type": "WebPageElement",
    "cssSelector": "#principal"
  }
}
</script>
Microdata
Now, we are going to see how the previous implementations work with microdata:
mainEntityOfPage property:
<main itemscope itemtype="http://schema.org/Article">
  <meta itemprop="mainEntityOfPage" content="https://www.humanlevel.com/articulos/desarrollo-web/como-interpretar-schema-org-para-crear-datos-estructurados.html"/>
  <p itemprop="author">…</p>
  …
</main>
mainEntity property:
<body itemscope itemtype="http://schema.org/WebPage">
  <main itemprop="mainEntity" itemscope itemtype="http://schema.org/Article">
    <p itemprop="author">…</p>
    …
  </main>
</body>
mainContentOfPage property:
<body itemscope itemtype="http://schema.org/WebPage">
  <main itemprop="mainContentOfPage" itemscope itemtype="http://schema.org/WebPageElement">
    …
  </main>
</body>
Conclusion
In some cases structured data provide more information than HTML5 tags, and Googlebot is more likely to take them into account when working out how a page is organised and, above all, what its main content is. That content is the most important part for rankings and should stand out from the rest of the elements. That’s why it’s important to always implement structured data, preferably with JSON-LD.