Structured data equivalent to HTML5 semantic tags


Written by Ramón Saquete


Schema.org includes structured data types that are equivalent, or almost equivalent, to HTML5 semantic tags, which makes us wonder whether this information matters to Googlebot (remember that Schema.org was created, in part, by Google).

We do not know for certain whether Google takes this type of structured data and HTML5 semantic tags into account, but if it does, they can only have a positive effect on rankings, because they tell the crawler how pages are organised and answer several questions it needs to resolve in order to index the content correctly.

Questions like:

  • How do I separate the header and the footer from the main content?
  • Is there any content that isn’t part of the main content and is only tangentially related to it?
  • Which groups of links are used to navigate between pages, rather than being simply a list of links on a specific page?

Ultimately, this markup helps identify the main content of our pages, which is the content that must rank and stand out above everything else on each page.

Tagging this information with HTML5 semantic tags and, at the same time, with their equivalent structured data may seem redundant, but since we don’t know how much importance Google gives to each of them, we recommend implementing both.


Which structured data types are equivalent to HTML5 semantic tags?

The <header> and <footer> tags mark up the header and the footer of any sectioning root or sectioning content element in HTML5. The most important ones are those that are direct children of the sectioning root <body>, with no intermediate sectioning content elements (<nav>, <article>, <section>, and <aside>), because these are the main header and footer, repeated sitewide.

Interestingly enough, the https://schema.org/WPHeader and http://schema.org/WPFooter structured data types represent the website’s header and footer in the same way. It wouldn’t be correct to apply them to <header> and <footer> tags nested inside a sectioning content element such as <article> or <section>, because these types are equivalent only to the main <header> and <footer> of the page.
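As a rough illustration of that distinction, the sketch below tags only the page-level header and footer, using microdata. The id values and the placeholder content are assumptions made up for this example, not part of any real site.

```html
<body>
  <!-- Main header: direct child of <body>, so WPHeader applies -->
  <header id="site-header" itemscope itemtype="https://schema.org/WPHeader">
    Site logo and main menu
  </header>
  <main>
    <article>
      <!-- Header of a sectioning content element: left without WPHeader -->
      <header><h1>Article title</h1></header>
      <p>Article text</p>
    </article>
  </main>
  <!-- Main footer: direct child of <body>, so WPFooter applies -->
  <footer id="site-footer" itemscope itemtype="https://schema.org/WPFooter">
    Legal notice and contact links
  </footer>
</body>
```

The <header> inside <article> is still valid HTML5; it simply doesn't receive the sitewide type.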

Similarly, we can use the https://schema.org/SiteNavigationElement structured data type for all <nav> elements, but the only important one is the one containing the main menu of the page, although we could also tag the group of links in the footer. For breadcrumbs it is best to use the https://schema.org/BreadcrumbList type. Pagination is currently detected by Google without us even specifying what the next and previous pages are, so it doesn’t need structured data either.
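A possible way to express both ideas in JSON-LD is sketched below: the main menu is associated with SiteNavigationElement through a CSS selector, and the breadcrumb trail is described with BreadcrumbList. The #main-menu id, the page names and the example.com URLs are hypothetical values chosen only for illustration.

```html
<!-- Hypothetical main menu of the page -->
<nav id="main-menu">
  <a href="/">Home</a>
  <a href="/blog/">Blog</a>
</nav>

<!-- Associates the SiteNavigationElement type with the main menu only -->
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "SiteNavigationElement",
  "cssSelector": "#main-menu"
}
</script>

<!-- Describes the breadcrumb trail leading to the current page -->
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "BreadcrumbList",
  "itemListElement": [
    { "@type": "ListItem", "position": 1, "name": "Home", "item": "https://www.example.com/" },
    { "@type": "ListItem", "position": 2, "name": "Blog", "item": "https://www.example.com/blog/" }
  ]
}
</script>
```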

For the <aside> tag, used for any type of content surrounding the main content (sidebar or not), we have https://schema.org/WPSideBar. If the element only contains ads, however, the correct type to use instead is https://schema.org/WPAdBlock.

For <body> we have the http://schema.org/WebPage type. Tagging the body of the document doesn’t seem very useful, but there are structured data types derived from it that specify what kind of page it is: for example, a FAQ page (https://schema.org/FAQPage), a contact page (https://schema.org/ContactPage), an About page (https://schema.org/AboutPage), and so on.
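For instance, a contact page could declare its more specific type with a block along these lines. This is only a sketch: the name and the example.com URL are placeholder assumptions.

```html
<!-- Declares the page as a ContactPage, a subtype of WebPage -->
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "ContactPage",
  "name": "Contact",
  "url": "https://www.example.com/contact/"
}
</script>
```

Because ContactPage derives from WebPage, it also accepts the WebPage properties discussed in this post.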

But the most important thing is that the type https://schema.org/WebPage has several properties imitating the meaning of the <main> tag, useful for tagging the main content of the page.

The structured data equivalent to the <main> tag are:

  • The mainContentOfPage property of https://schema.org/WebPage, whose value should be of type https://schema.org/WebPageElement, a generic structured data type applicable to any tag.
  • The mainEntity property of https://schema.org/WebPage, which holds the structured data item representing the main content of the page (for cases in which that content uses some specific structured data type).
  • Alternatively, and equivalently to mainEntity on https://schema.org/WebPage, the mainEntityOfPage property, which is used inside the main structured data type of the page and takes the current URL as its value. Google recommends using this property with the https://schema.org/Article type when the article is the main content of the page.

Later in this post we will see each of these points with an example.

Finally, the old <table> element also has its equivalent structured data type, which is https://schema.org/Table.
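A small sketch of how a table might be tagged, using the cssSelector property that the JSON-LD section of this post explains. The #price-table id and the table contents are invented for the example.

```html
<!-- Hypothetical data table -->
<table id="price-table">
  <tr><th>Plan</th><th>Price</th></tr>
  <tr><td>Basic</td><td>10 €</td></tr>
</table>

<!-- Associates the Table type with that element -->
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Table",
  "cssSelector": "#price-table"
}
</script>
```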

General recommendations for implementing these types of structured data

JSON-LD

We are going to see a usage example of this type of structured data with JSON-LD (JSON for Linked Data), which is the Google-recommended format. To do this, let’s imagine we have several ad blocks on the same topic scattered throughout the page. In this case, we can use the same class for all of them and associate it with the WPAdBlock type, using the following JSON-LD code:

…
Page content …
<aside class="ad">Ad 1</aside>
Page content …
<aside class="ad">Ad 2</aside>
…
<script type="application/ld+json">
{
  "@context": "http://schema.org/",
  "@type": "WPAdBlock",
  "cssSelector": ".ad"
}
</script>

With the cssSelector property we specify the CSS selector that assigns the current type of data to the selected element (it’s also possible to specify an XPath expression with the xpath property).
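The xpath alternative could be sketched as follows for the same ad blocks. Note that this particular expression is an assumption of the example and only matches <aside> elements whose class attribute is exactly "ad", so elements with additional classes would need a different expression.

```html
<aside class="ad">Ad 1</aside>
<aside class="ad">Ad 2</aside>

<!-- Same association as before, expressed with XPath instead of CSS -->
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "WPAdBlock",
  "xpath": "//aside[@class='ad']"
}
</script>
```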

[Figure: WPAdBlock JSON-LD]
The result in Google’s Structured Data Testing Tool shows what this implementation expresses: even though they are visually two separate blocks, semantically they belong to the same ad block.

If we wanted to set two semantically independent blocks, for example, sidebars with two different intents, we would tag them separately:

<body>
<main>
<aside id="interiorSidebar">Main content sidebar</aside>
</main>
<aside id="exteriorSidebar">Website sidebar</aside>
….
<script type="application/ld+json">
{
  "@context": "http://schema.org/",
  "@type": "WPSideBar",
  "cssSelector": "#interiorSidebar"
}
</script>
<script type="application/ld+json">
{
  "@context": "http://schema.org/",
  "@type": "WPSideBar",
  "cssSelector": "#exteriorSidebar"
}
</script>

[Figure: WPSideBar JSON-LD]
This time, the structured data tool displays the type of data for each container.

We can apply this implementation to all types deriving from WebPageElement (they all inherit the cssSelector and xpath properties): SiteNavigationElement, Table, WPAdBlock, WPFooter, WPHeader and WPSideBar. We should take care that WPHeader and WPFooter are each associated with one single element.
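One way to follow that last caution, sketched under the assumption that the page has a single <header> and a single <footer> as direct children of <body>, is to use the child combinator in the selector. Headers and footers nested inside <article> or <section> are then left out.

```html
<!-- "body > header" matches only a header that is a direct child of <body> -->
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "WPHeader",
  "cssSelector": "body > header"
}
</script>

<!-- "body > footer" matches only a footer that is a direct child of <body> -->
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "WPFooter",
  "cssSelector": "body > footer"
}
</script>
```

An id selector would work just as well; the child combinator is shown only because it mirrors the HTML5 rule about main headers and footers.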

Microdata

We can also implement this type of data with microdata, in the following way:

<header itemscope itemtype="https://schema.org/WPHeader" id="header">
...
</header>

This implementation is shorter in cases where the data type appears only once, as with the header and the footer. It is also the way in which WordPress implements these structured data. However, the Google Structured Data Testing Tool doesn’t detect this type of implementation if the tagged element doesn’t carry an identifier, as seen in the example (id="header"). Moreover, with this implementation the tool doesn’t associate any content with the data type, while with JSON-LD it takes the content of the element as its value, so the JSON-LD implementation appears to be more reliable.

[Figure: Microdata vs JSON-LD]
With the JSON-LD implementation we can be certain which content is associated with the data type, although both approaches probably have the same effect.

Implementation of the main content (<main> tag) with structured data

JSON-LD

Let’s see an example of how to use the mainEntityOfPage property:

<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Article",
  "mainEntityOfPage": "https://www.humanlevel.com/articulos/indexabilidad/datos-estructurados-equivalentes-a-etiquetas-semanticas-de-html5.html",
  "author": "...",
  ...
}
</script>

If we encapsulate this structured data type inside WebPage, we could express exactly the same thing with mainEntity in the following way:

<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "WebPage",
  "mainEntity": {
    "@type": "Article",
    "author": "...",
    ...
  }
}
</script>

If we don’t have any structured data type that we can associate to the main content of the page, we can use WebPage and mainContentOfPage to implement it thus:

<main id="principal">
    Lorem ipsum
</main>
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "WebPage",
  "mainContentOfPage": {
    "@type": "WebPageElement",
    "cssSelector": "#principal"
  }
}
</script>

Microdata

Now, we are going to see how the previous implementations work with microdata:

mainEntityOfPage property:

    <main itemscope itemtype="http://schema.org/Article">
         <meta itemprop="mainEntityOfPage" content="https://www.humanlevel.com/articulos/desarrollo-web/como-interpretar-schema-org-para-crear-datos-estructurados.html"/>
         <p itemprop="author">…</p>
         …
    </main>

mainEntity property:

<body itemscope itemtype="http://schema.org/WebPage">
     <main itemprop="mainEntity" itemscope itemtype="http://schema.org/Article">
        <p itemprop="author">…</p>
        …
     </main>
</body>

mainContentOfPage property:

<body itemscope itemtype="http://schema.org/WebPage">
    <main itemprop="mainContentOfPage" itemscope itemtype="http://schema.org/WebPageElement">
        …
    </main>
</body>

Conclusion

In some cases, structured data provide more information than HTML5 tags, and Googlebot is more likely to take them into account when working out how a page is organised and, above all, what its main content is. The main content is the most important part for rankings and should stand out from the rest of the elements. That’s why it’s important to always implement structured data, preferably with JSON-LD.

Author: Ramón Saquete
Web developer at Human Level Communications online marketing agency. He's an expert in WPO, PHP development and MySQL databases.
