Written by Ramón Saquete
- What is HTTPS?
- Why should you use it on your website?
- Why shouldn’t you use it on your website?
- How does encryption work?
- What type of SSL certificate to choose?
- What should be taken into account in terms of SEO in order not to lose traffic during a migration?
- What should the system administrator take into account in order not to lose traffic?
- Can I use HTTP2 at the same time as HTTPS?
- Is HTTPS really secure?
- Additional references
Ever since Google announced in 2014 that the use of HTTPS was going to favor SEO, many websites have been implementing it. But do you really know what Hypertext Transfer Protocol Secure is? Is it really secure for you and your visitors? Have you ever wondered if your website needs it? Do you know how to purchase it? And, above all, do you know how to migrate to HTTPS so that it affects SEO as little as possible? You will find the answers to these questions and many more if you read on.
What is HTTPS?
Imagine that five people are involved in the process of sending a letter: the first person writes the content of the letter and a series of additional data in the header that indicate things related to that content, such as the size of the letter, its date, the language, etc. We can give this person the “strange name” of HTTP sender.
The HTTP sender passes the letter to the second person (the SSL sender or TLS sender), who encrypts the content by replacing some letters with others.
The SSL or TLS sender then passes the letter to the third person (the TCP sender) who, among other things, breaks the letter into pieces and numbers them, because the letter carrier does not like to carry very large letters.
The TCP sender then gives each piece of the letter to the fourth person, the IP sender, who puts each piece in an envelope on which he writes the destination address and the sender's address.
Finally, the IP sender hands each envelope to the fifth and last person, who could be called Ethernet, PPP, etc., and who drops each envelope in the nearest mailbox for the letter carrier to take to its destination.
When the letter arrives at its destination, it follows the reverse process: the Ethernet receiver picks up the envelopes and passes them to the IP receiver, who takes the pieces out of the envelopes and gives them to the TCP receiver, who puts them back together. Then the SSL receiver decrypts the content and, finally, the HTTP receiver looks at the headers to know how to interpret the content.
As you may have noticed, if in this description you change person for communication protocol and letter for HTML document, what you have is the operation of the HTTPS protocol. The operation of the HTTP protocol would be exactly the same but eliminating the SSL/TLS protocol.
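To make the analogy a little more concrete, here is a minimal sketch in Python of the five "people" handing the letter to each other. Everything in it is a simplification I have made up for illustration: the function names, the addresses, the segment size and, above all, the "encryption", which is a simple XOR with a fixed key and has nothing to do with what SSL/TLS really does.

```python
from typing import Dict, List, Tuple

KEY = 0x5A         # hypothetical shared secret for the toy cipher
SEGMENT_SIZE = 8   # hypothetical maximum size of one piece of the letter


def http_wrap(body: str) -> bytes:
    """HTTP sender: put a header describing the content in front of it."""
    payload = body.encode("utf-8")
    header = f"Content-Length: {len(payload)}\r\n\r\n".encode("ascii")
    return header + payload


def toy_cipher(data: bytes) -> bytes:
    """SSL/TLS stand-in: XOR every byte with KEY (NOT real encryption)."""
    return bytes(b ^ KEY for b in data)


def tcp_split(data: bytes) -> List[Tuple[int, bytes]]:
    """TCP sender: break the letter into numbered pieces."""
    return [
        (start // SEGMENT_SIZE, data[start:start + SEGMENT_SIZE])
        for start in range(0, len(data), SEGMENT_SIZE)
    ]


def ip_wrap(segments: List[Tuple[int, bytes]], src: str, dst: str) -> List[Dict]:
    """IP sender: put each piece in an envelope with both addresses."""
    return [{"src": src, "dst": dst, "segment": seg} for seg in segments]


def send(body: str, src: str = "192.0.2.1", dst: str = "198.51.100.7") -> List[Dict]:
    """Run the letter through every sender, from HTTP down to IP."""
    return ip_wrap(tcp_split(toy_cipher(http_wrap(body))), src, dst)


def receive(packets: List[Dict]) -> str:
    """Reverse process: open envelopes, reorder, decrypt, read the header."""
    segments = sorted(p["segment"] for p in packets)   # order by piece number
    data = b"".join(chunk for _, chunk in segments)
    plain = toy_cipher(data)                           # XOR undoes itself
    header, _, payload = plain.partition(b"\r\n\r\n")
    length = int(header.split(b":")[1].strip().decode("ascii"))
    return payload[:length].decode("utf-8")
```

Even if the envelopes arrive out of order, `receive(send("<html>Hola</html>"))` gives back the original document, because the numbering added by the TCP step allows the pieces to be sorted again.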
In the following image I graphically illustrate the journey of this HTML document, where the sender and receiver can be, respectively, the web server and the browser or vice versa:
HTTPS achieves three improvements over HTTP:
- 1. All information travels encrypted between the web server and the clients.
- 2. No one who intercepts the information on its way to its destination can modify it without being detected.
- 3. The domain of the website is authenticated, that is to say, your website will have a digital signature that proves to those who browse it that the domain is yours and really belongs to your company.
We will go into each point further on.
Why should you use it on your website?
Although some users do not pay attention to whether a website has HTTPS, and most of those who do only know that it is useful for sending encrypted data, that is enough to build trust in your site, which is very important in e-commerce or on any website where the user is asked for personal data.
On the other hand, as I mentioned at the beginning, Google has said that the use of HTTPS favors SEO.
Why shouldn’t you use it on your website?
Because it negatively affects page load time (although not by much), since the encryption method has to be negotiated on the first connection and the data has to be encrypted and decrypted.
When we switch to HTTPS, all the URLs of our website change, which will temporarily affect rankings, since we will have to set up redirections, and we will lose the “likes” on social networks. So, if we have a blog and do not collect user data, the migration may not pay off, as the small improvement in SEO may be cancelled out by the loss of performance.
How does encryption work?
HTTPS uses two kinds of encryption algorithm: a public key (asymmetric) algorithm, typically RSA, and a secret key (symmetric) algorithm, usually AES or 3DES. The first is used to transmit, in encrypted form, the key of the second, which is faster to use. As I don’t want to bore you with explanations about cryptography and discrete mathematics, if you want to go a little deeper you can read this article I wrote about RSA.
The important thing to know is that public key encryption, which is used to protect the information, is also used to create the digital signature that verifies that your domain belongs to you. To make a real-world comparison, it is as if you had signed before a notary that your domain belongs to your company, and all the users of your website had been present to witness it. This “notary” is called a Certification Authority (CA). CAs are companies that sell the digitally signed SSL certificates needed to have HTTPS, and they account for most of what you will pay when you contract this service with your hosting provider. The hosting provider may offer you certificates from a CA with which it has an agreement, or it may let you install certificates that you have bought directly from some other CA. In the latter case you would be skipping an intermediary, but this does not mean it will always be cheaper for you.
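Going back to the two algorithms mentioned at the start of this section, the idea of using the slow public key algorithm only to transport the key of the fast one can be sketched in a few lines of Python. This is a toy under loud assumptions: the RSA key is built from tiny textbook primes, the session key is encrypted byte by byte, and a SHA-256 keystream stands in for AES because the standard library has no AES. The function names are mine, and none of this is secure or reflects the real TLS handshake.

```python
import hashlib
import secrets

# Toy RSA key pair built from tiny primes (insecure, illustration only).
P, Q = 61, 53
N = P * Q                            # public modulus
E = 17                               # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))    # private exponent (needs Python 3.8+)


def rsa_encrypt(data: bytes) -> list:
    """Encrypt each byte with the PUBLIC key (N, E)."""
    return [pow(byte, E, N) for byte in data]


def rsa_decrypt(numbers: list) -> bytes:
    """Recover each byte with the PRIVATE key (N, D)."""
    return bytes(pow(number, D, N) for number in numbers)


def symmetric_cipher(key: bytes, data: bytes) -> bytes:
    """Stand-in for AES: XOR the data with a keystream derived from key.

    The same call encrypts and decrypts, because XOR undoes itself.
    """
    out = bytearray()
    for offset in range(0, len(data), 32):
        block = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        chunk = data[offset:offset + 32]
        out.extend(b ^ k for b, k in zip(chunk, block))
    return bytes(out)


def client_send(message: bytes):
    """Browser side: invent a session key and protect it with RSA."""
    session_key = secrets.token_bytes(16)
    wrapped_key = rsa_encrypt(session_key)               # slow, used once
    ciphertext = symmetric_cipher(session_key, message)  # fast, used for data
    return wrapped_key, ciphertext


def server_receive(wrapped_key: list, ciphertext: bytes) -> bytes:
    """Server side: unwrap the session key, then decrypt the data."""
    session_key = rsa_decrypt(wrapped_key)
    return symmetric_cipher(session_key, ciphertext)
```

Only the holder of the private exponent `D` can recover the session key, and from then on both sides use the faster symmetric cipher for everything else.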
What type of SSL certificate to choose?
There are several classifications for certificates:
According to the number of domains
- For a single domain: on servers or clients without SNI (Server Name Indication) support, only one certificate can be served per IP, so if we have another domain on the same server and we want to add HTTPS to it, we will have to buy another certificate and another IP.
- Multidomain: with a single certificate we can have HTTPS on several websites hosted on the same server under the same IP.
- Wildcard: these are certificates that can be used for all the subdomains of a domain. For example, if we want to have https://midominio.com and https://www.midominio.com, because we intend to redirect https://midominio.com to the subdomain https://www.midominio.com, with a wildcard certificate we will need only one certificate and one IP, while with single-domain certificates we will need two of each.
- Self-signed: a certificate that we sign ourselves, vouching that the website is ours. Certificates of this type cost nothing and are used for testing or for secure access to services that only we are going to use. Browsers show a security alert for these certificates.
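As a side note on the wildcard case described in the list above, this is a minimal sketch of how a name such as `*.midominio.com` is usually compared with a hostname. It assumes the common rule that the asterisk stands for exactly one label on the left; the function names are hypothetical. Under that rule the wildcard name alone does not cover the bare domain, so the example with https://midominio.com works when the CA also lists `midominio.com` as an additional name in the same certificate, which many of them do.

```python
from typing import Iterable


def wildcard_matches(pattern: str, hostname: str) -> bool:
    """Return True if hostname is covered by a certificate name.

    Assumption: "*" is only allowed as the whole left-most label and
    replaces exactly one label (never zero, never several).
    """
    pattern_labels = pattern.lower().split(".")
    host_labels = hostname.lower().split(".")
    if len(pattern_labels) != len(host_labels):
        return False
    if pattern_labels[0] == "*":
        # The left-most host label must exist; the rest must be identical.
        return host_labels[0] != "" and pattern_labels[1:] == host_labels[1:]
    return pattern_labels == host_labels


def certificate_covers(names: Iterable[str], hostname: str) -> bool:
    """A certificate covers a hostname if any of its names matches."""
    return any(wildcard_matches(name, hostname) for name in names)
```

For instance, `certificate_covers(["*.midominio.com", "midominio.com"], host)` is true for both `www.midominio.com` and `midominio.com`, but not for `a.b.midominio.com`.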
Certificates signed by a CA, according to validation type
The type of validation refers to the amount of bureaucracy required to validate that our company really owns the domain:
- Domain validation: with this type of validation they do not ask you any questions; they only verify that you control the domain. Many certification authorities do not offer this service because it says nothing about who is behind the website, as validation is easy to pass. At the other extreme we have the relatively recent CA Let’s Encrypt, which offers free certificates of this type, something attackers are already taking advantage of. Some browsers display these certificates with a gray padlock and others with a green padlock in the navigation bar.
- Business or organizational validation: this type of validation is more exhaustive. You have to prove that you own a company by providing the company name, manager, confirmation that the domain belongs to you by replying to an email that is usually sent to the domain administrator and anything else that the CA asks for. Browsers display this type of certificate with a green padlock:
- Extended validation: in this type of validation they ask for even more information and make more exhaustive checks, such as that the physical address of the company matches the address given for the domain. It is also the most expensive type of validation, so it is usually only used by banks or major institutions. This type of validation is shown with a green padlock and the name of the validated company. In addition, if we click on the padlock we can see more information about the company:
- Number of bits of the RSA public key: the more bits, the more secure the certificate. Google recommends public key certificates of at least 2048 bits.
- Type of hashing algorithm for the signature: the algorithm used to compute the digest (“hash”) of the certificate, which is what the CA actually signs. Here it should be noted that SHA1 is deprecated as of January 1, 2017 and the browser may display a warning:
It is recommended that the certificate use some variant of SHA-2 such as SHA-256.
- Type of symmetric encryption algorithm: it can be AES or 3DES (DES is obsolete). AES with a key of at least 128 bits is recommended, because it is faster.
- Compatibility with browsers: this is a point that all CAs usually comply with, but it is worth mentioning that they are compatible with 99% of browsers. 100% is impossible as there may be very old browsers with which they are not.
We can find almost any combination of these characteristics in the certificates offered by the certification authorities. For example, we can have a certificate that is multi-domain, wildcard and with extended validation, all at the same time.
Some well-known certification authorities include GeoTrust, Verisign, Thawte, Comodo, Symantec, etc. There are also resellers offering cheap certificates, as well as the free Let’s Encrypt (their automatic certificate management software is currently not available for Windows servers). So search Google for “SSL certificate” or something similar and you can start comparing certificates and prices to find one that fits your needs and budget. The price of a certificate for a single domain is usually between 40 € and 300 € per year.
What should be taken into account in terms of SEO in order not to lose traffic during a migration?
- 301 redirections: all URLs with http must be redirected to their https version. Considering all the possibilities, for the domain https://www.midominio.com, for example, all these redirections would be recommended:
- http://www.midominio.com -> https://www.midominio.com
- https://midominio.com -> https://www.midominio.com (for this redirection it is necessary to have a certificate for midominio.com, as the redirection occurs after the HTTPS connection is validated)
- http://midominio.com -> https://www.midominio.com
- Canonical and alternate link tags: all rel=”canonical” and rel=”alternate” tags, for languages and for alternate mobile URLs, must point to https URLs.
- Sitemap: if we have the possibility to regenerate the sitemaps we should do it.
- Set the main domain with https in Google Search Console.
- Check the robots.txt file to see if it contains absolute URLs or rules that filter traffic by protocol.
- Update links and paths to images in the code.
If any image in the code is loaded over http, the browser will show a warning sign on the padlock.
If we have frames (iframes) loaded over http inside a website served over https, the browser's default security policy will probably block them from being displayed.
- Crawl the website with a crawler to see that all links are working properly.
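The three 301 redirections listed at the start of this checklist can be expressed as a single rule: whatever the protocol and whether or not the www is present, end up at https://www.midominio.com with the same path and parameters. Here is a minimal sketch of that rule in Python. The function name and the two constants are hypothetical, and in a real site this logic would normally live in the web server configuration rather than in application code.

```python
from typing import Optional
from urllib.parse import urlsplit, urlunsplit

CANONICAL_HOST = "www.midominio.com"                  # the version we want indexed
KNOWN_HOSTS = {"midominio.com", "www.midominio.com"}  # hosts that belong to the site


def redirect_target(url: str) -> Optional[str]:
    """Return the URL to answer with in a 301, or None if no redirect is needed."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host not in KNOWN_HOSTS:
        return None  # not one of our hosts, nothing to do
    if parts.scheme == "https" and host == CANONICAL_HOST:
        return None  # already the canonical version
    # Keep path, query string and fragment; change only scheme and host.
    return urlunsplit(("https", CANONICAL_HOST, parts.path, parts.query, parts.fragment))
```

Redirecting in a single hop to the final URL, instead of chaining http -> https -> www, avoids intermediate redirections that crawlers would otherwise have to follow.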
What should the system administrator take into account in order not to lose traffic?
The process of installing a certificate can be complex, especially if you do not have a control panel that automates it, and it also depends on the web server. I am not going to describe here what it consists of or how it is done. Instead, I will list the configuration aspects that must be taken into account so that the website does not become inaccessible, and that we can check to see whether the administrator who installed the certificate has done the job well:
It is very important that the system administrator remembers to renew the certificate every year or on the corresponding date. It is not uncommon to find customers with their website blocked by a security warning, for a whole day or more, because the administrator did not renew the certificate. The renewal process can take time and must be started in advance of the expiration date. The certification companies will not charge you for those extra days when you renew early; on the contrary, many offer discounts.
We can see the expiration date of a certificate by viewing its properties in the advanced settings of the browser.
If we use Let’s Encrypt, the renewal is done automatically by a program.
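If you prefer not to depend on remembering the date, the check can be automated. The following is a minimal sketch using Python's standard `ssl` and `socket` modules: one function reads the expiration date (`notAfter`) from the certificate a server presents, and another converts it into the number of days left. The function names and the 30-day threshold in the usage note are my own assumptions, not a recommendation from any CA.

```python
import socket
import ssl
import time
from typing import Optional


def fetch_not_after(hostname: str, port: int = 443) -> str:
    """Connect to the server and return its certificate's expiration date."""
    context = ssl.create_default_context()
    with socket.create_connection((hostname, port), timeout=10) as sock:
        with context.wrap_socket(sock, server_hostname=hostname) as tls:
            # getpeercert() returns a dict; "notAfter" looks like
            # "Jan  5 09:34:43 2018 GMT".
            return tls.getpeercert()["notAfter"]


def days_until_expiry(not_after: str, now: Optional[float] = None) -> float:
    """Days between now (seconds since the epoch) and the expiration date."""
    expires = ssl.cert_time_to_seconds(not_after)
    if now is None:
        now = time.time()
    return (expires - now) / 86400
```

A scheduled task could call `days_until_expiry(fetch_not_after("www.midominio.com"))` once a day and send an alert when the result drops below, say, 30.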
Installing root and intermediate certificates on the server
Certification authorities also have their digital signature that they use to sign your certificate. These CA certificates are called root certificates. Their public key is pre-installed in all browsers to verify the signature of the certificates of the websites that use them.
There is also another type of certificate called an intermediate certificate, which is used for security: it allows the root certificate's private key to be kept offline, since if that key became public, none of the CA's certificates would be valid any longer. When an intermediate certificate exists, the root signs it, and the intermediate in turn signs the certificate of our website.
This intermediate certificate must be installed on the server because, if the browser does not already have it, the server transmits it along with the website's certificate. It may happen that the administrator forgets to install the intermediate certificate on the server and, as a result, HTTPS does not work in all browsers. So it is always advisable to test the website on all major desktop and mobile browsers. If it fails on one of them, the most likely cause is that the administrator forgot to install the intermediate certificate on the server.
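The chain root -> intermediate -> website can be illustrated with a toy model. In this sketch the "certificates" are plain dictionaries and the signatures use RSA keys built from tiny primes, so it is insecure by design and only meant to show why a browser that knows nothing but the root cannot validate the site when the intermediate is missing. All names, primes and the certificate layout are assumptions of mine and have nothing to do with the real X.509 format.

```python
import hashlib

E = 17  # public exponent shared by every toy key in this sketch


def make_key(p: int, q: int) -> dict:
    """Build a toy RSA key from two tiny primes (illustration only)."""
    return {"n": p * q, "d": pow(E, -1, (p - 1) * (q - 1))}  # Python 3.8+


def digest(data: bytes, n: int) -> int:
    """SHA-256 digest reduced modulo n so that the toy key can sign it."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % n


def body(subject: str, subject_n: int, issuer: str) -> bytes:
    """The part of the certificate that gets signed."""
    return f"{subject}|{subject_n}|{issuer}".encode("utf-8")


def issue(subject: str, subject_n: int, issuer: str, issuer_key: dict) -> dict:
    """The issuer signs the subject's name and public key with its private key."""
    n, d = issuer_key["n"], issuer_key["d"]
    signature = pow(digest(body(subject, subject_n, issuer), n), d, n)
    return {"subject": subject, "n": subject_n, "issuer": issuer, "signature": signature}


def signed_by(cert: dict, issuer_n: int) -> bool:
    """Check the signature using only the issuer's PUBLIC modulus."""
    expected = digest(body(cert["subject"], cert["n"], cert["issuer"]), issuer_n)
    return pow(cert["signature"], E, issuer_n) == expected


def verify_chain(chain: list, trusted_roots: dict) -> bool:
    """chain is what the server sends: [site, intermediate, ...]."""
    for position, cert in enumerate(chain):
        if cert["issuer"] in trusted_roots:
            # Signed directly by a root pre-installed in the browser.
            return signed_by(cert, trusted_roots[cert["issuer"]])
        if position + 1 == len(chain):
            return False  # issuer unknown and the server sent nothing else
        parent = chain[position + 1]
        if parent["subject"] != cert["issuer"] or not signed_by(cert, parent["n"]):
            return False
    return False


root_key = make_key(61, 53)
intermediate_key = make_key(67, 71)
site_key = make_key(89, 97)

trusted_roots = {"Root CA": root_key["n"]}  # what the browser ships with
intermediate_cert = issue("Intermediate CA", intermediate_key["n"], "Root CA", root_key)
site_cert = issue("www.midominio.com", site_key["n"], "Intermediate CA", intermediate_key)
```

With these values, `verify_chain([site_cert, intermediate_cert], trusted_roots)` succeeds, while `verify_chain([site_cert], trusted_roots)` fails: that second call is the situation of a server on which the intermediate certificate was never installed.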
In the advanced configuration of our browser, we can see the certificates that we have installed:
It is also recommended to use the latest version of TLS that the server supports (1.2 at the time of writing). SSL is the older predecessor of TLS.
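How the minimum protocol version is set depends on each web server, so take the following only as an illustration of the idea. It is a minimal sketch with Python's standard `ssl` module (Python 3.7 or later) that builds a server-side context refusing anything older than TLS 1.2. The function name and the file paths are hypothetical.

```python
import ssl
from typing import Optional


def make_server_context(certfile: Optional[str] = None,
                        keyfile: Optional[str] = None) -> ssl.SSLContext:
    """Build a server-side TLS context that rejects SSL and old TLS versions."""
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    # Refuse handshakes below TLS 1.2; newer versions stay allowed.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    if certfile is not None:
        # certfile is a PEM file with the site certificate followed by
        # any intermediate certificates; keyfile holds the private key.
        context.load_cert_chain(certfile, keyfile)
    return context
```

Note that the file passed to `load_cert_chain` is also the place where the intermediate certificates go, which connects with the previous point about not forgetting them.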
Finally, when the administrator activates HTTPS on our website, it should work over both HTTP and HTTPS and it is up to the developer to add the redirects.
Verification by means of an online tool
These are the most common failures, but you can check the installation of your certificate in this link.
Can I use HTTP2 at the same time as HTTPS?
HTTP2 is the successor to HTTP 1.1. Since HTTPS is actually two protocols (HTTP and SSL/TLS) and security is provided by SSL/TLS, changing the version of HTTP does not affect what the next protocol does, just as changing IPv4 to IPv6 does not affect the other protocols. This is why communication protocols were designed in layers and why all combinations are possible: HTTP2 with HTTPS, HTTP2 without HTTPS, HTTP 1.1 with HTTPS and HTTP 1.1 without HTTPS. In practice, however, browsers do not support the combination of HTTP2 without HTTPS.
Is HTTPS really secure?
Yes, it is. The main weak point is attacks of the MitM (Man in the Middle) type. If we connect to the Internet through an open WiFi network that an attacker has set up, the attacker could intercept the communication so that, when we type http://www.facebook.com in our browser, we are prevented from being redirected to https://www.facebook.com, and all our data is captured (if we do not notice that there is no padlock in the address bar). To try to avoid this, at least in most cases, the following HSTS header of the HTTP protocol is used:
Strict-Transport-Security: max-age=31536000
This header tells the browser that, for one year, it must not make any HTTP requests to the site and must use HTTPS instead. The only problem is that we give up the possibility of changing our minds and going back to HTTP on our website during that time.
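As a small illustration, this is how the value of that header, whose standard name is `Strict-Transport-Security`, could be built and read in Python. The `max-age` directive is expressed in seconds, so one year is 31536000. The two function names are hypothetical, and the optional `includeSubDomains` directive is shown only as a possibility, not as something every site should enable.

```python
from typing import Optional, Tuple

ONE_YEAR_SECONDS = 365 * 24 * 60 * 60  # 31536000


def hsts_header(max_age: int = ONE_YEAR_SECONDS,
                include_subdomains: bool = False) -> Tuple[str, str]:
    """Return the (name, value) pair to add to every HTTPS response."""
    value = f"max-age={max_age}"
    if include_subdomains:
        value += "; includeSubDomains"
    return ("Strict-Transport-Security", value)


def parse_max_age(value: str) -> Optional[int]:
    """Extract max-age (in seconds) from a header value, or None if absent."""
    for directive in value.split(";"):
        name, _, argument = directive.strip().partition("=")
        if name.lower() == "max-age":
            return int(argument.strip().strip('"'))
    return None
```

The header only has effect when it is received over HTTPS. If we later need to go back, serving it over HTTPS with `max-age=0` asks browsers to forget the policy, although browsers that do not visit the site again in the meantime will keep applying it until their stored period runs out.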
Additional references
- HTTPS protocol definition at W3C
- Frequently asked questions about the HTTPS secure protocol
- Original HTTP protocol definition
- HTTP2 definition on GitHub
- Frequently asked questions about HTTP2
- Post on how to use Let’s Encrypt to implement a free SSL certificate on your website
I hope it has been helpful. If you have any questions, do not hesitate to ask in the comments, and I will try to answer them as soon as possible.