Written by Ramón Saquete
Table of contents
- 1 What is HTTPS?
- 2 Why you should use it on your website
- 3 Why you should not use it on your website
- 4 How does encryption work?
- 5 What type of SSL certificate should you choose?
- 6 In terms of SEO, what should you keep in mind to avoid losing traffic in a migration?
- 7 What should a system administrator keep in mind to avoid losing traffic when moving to https?
- 8 Can I use HTTP2 and HTTPS at the same time?
- 9 Is HTTPS really secure?
- 10 Additional references
Since Google announced in 2014 that use of HTTPS was going to favour SEO, many websites have started implementing it. Do you know what Hypertext Transfer Protocol Secure really is? Is it really secure for you and your visitors? Have you ever sat down to think whether your website needs it? Do you know where to get it? And, most importantly, do you know how to move your site to HTTPS so that your SEO suffers as little as possible as a result? You will find the answer to all these questions (and many others) if you keep reading this post.
What is HTTPS?
Imagine that there are five people who intervene in the process of sending a letter: the first one writes the content of the letter, and a series of additional data in the header related to this content, such as its size, date, language, etc. We are going to give this person a somewhat “strange” name of Issuing HTTP.
Issuing HTTP passes on this letter to a second person (Issuing SSL or Issuing TLS), who encrypts this content, replacing some letters by others.
Immediately after, SSL or TLS passes on the letter to a third person (Issuing TCP), who, amongst other things, breaks the letter into several pieces and assigns each piece a number, because the postman doesn’t like to carry mail that is too big in size.
Then Issuing TCP goes on to give every piece of this letter to Mr. Issuing IP, who takes each chunk and puts it in a separate envelope, on which he writes the recipient address and the return address.
And finally, Issuing IP delivers each envelope to the fifth and last person, who we call Ethernet, PPP, …, and this person assigns each letter to the closest mailbox, so that the postman takes it to its destination.
Once the letter reaches its destination, it follows a reverse process, meaning, Mr Receiving Ethernet picks up the letters and passes them on to Receiving IP, who takes the pieces out of the envelopes and gives them to Receiving TCP, who reassembles them. After that, Receiving SSL decrypts the content and Receiving HTTP looks at the headers to understand how it must interpret said content.
As you’ve probably noticed, if you replace “person” with “communication protocol”, and “letter” with “HTML document”, you will get an explanation of the HTTPS protocol as a result. HTTP works exactly the same way, only without the SSL/TLS step.
The picture below is a graphic illustration I made of this HTML document journey, where the issuer and the receiver can be the web server and the web browser respectively, and vice versa:
Three aspects which make HTTPS better than HTTP:
- All information travels encrypted between the web server and its clients.
- No one who intercepts the information on its way to its destination can modify it without being detected.
- The domain of the website is authenticated, meaning that your website will have a digital signature, demonstrating to those who browse it that it is your domain and that it really belongs to your business.
We will explore each of these points further later in this post.
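If you prefer code to letters, the journey described above can be sketched in a few lines of Python. This is only a toy model under loud assumptions: the function names (`http_layer`, `toy_tls_layer`, `tcp_layer`, `ip_layer`) are invented for illustration, the “encryption” is a single-byte XOR rather than real TLS, and the addresses are made up. What it does show is each layer wrapping the output of the previous one, and the receiver undoing the steps in reverse order.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    """A numbered piece of the letter, as produced by "Issuing TCP"."""
    seq: int
    data: bytes


@dataclass
class Packet:
    """An envelope with addresses, as produced by "Issuing IP"."""
    src: str
    dst: str
    segment: Segment


def http_layer(body: str) -> bytes:
    # "Issuing HTTP": put headers describing the content in front of it.
    payload = body.encode("utf-8")
    headers = f"Content-Length: {len(payload)}\r\nContent-Language: en\r\n\r\n"
    return headers.encode("ascii") + payload


def toy_tls_layer(data: bytes, key: int) -> bytes:
    # "Issuing SSL/TLS": a single-byte XOR stand-in, NOT real encryption.
    # XOR is its own inverse, so the same function also decrypts.
    return bytes(b ^ key for b in data)


def tcp_layer(data: bytes, size: int) -> List[Segment]:
    # "Issuing TCP": break the letter into numbered pieces.
    return [Segment(i // size, data[i:i + size]) for i in range(0, len(data), size)]


def ip_layer(segments: List[Segment], src: str, dst: str) -> List[Packet]:
    # "Issuing IP": one envelope per piece, with sender and recipient.
    return [Packet(src, dst, s) for s in segments]


def send(body: str, key: int, segment_size: int, src: str, dst: str) -> List[Packet]:
    return ip_layer(tcp_layer(toy_tls_layer(http_layer(body), key), segment_size), src, dst)


def receive(packets: List[Packet], key: int) -> str:
    # Reverse journey: open envelopes, reorder pieces, decrypt, read headers.
    segments = sorted((p.segment for p in packets), key=lambda s: s.seq)
    encrypted = b"".join(s.data for s in segments)
    message = toy_tls_layer(encrypted, key)
    _headers, _, body = message.partition(b"\r\n\r\n")
    return body.decode("utf-8")
```

Because every piece carries its own number, `receive` still rebuilds the document when the packets arrive in a different order from the one in which they were sent.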
Why you should use it on your website
Even though there are users who don’t check whether a website has HTTPS or not, most of those who do notice it only know that it is useful for sending encrypted data. And this is already enough for them to think your site is more trustworthy, something especially important for e-commerce stores or any other type of website that needs personal information from its users.
On the other hand, as I’ve said at the beginning, Google stated that:
HTTPS is a factor that slightly affects search results rankings, and it’s possible that its importance will increase.
Why you should not use it on your website
Because it negatively affects your website’s download speed (not much, though), as it needs to establish its encryption method with the first connection, and because it needs to encrypt and decrypt data.
Activating HTTPS means that all our website’s URLs are going to change, which will temporarily affect our positioning in search engines, as we’ll need to set up redirections, and we will lose all the “Likes” on social media that are tied to the old URLs. So, if we have a blog and we don’t collect user data, perhaps this small SEO improvement isn’t worth it, as it may not even be noticeable once the performance loss is taken into account.
How does encryption work?
HTTPS uses two kinds of encryption algorithm. One is a public-key (asymmetric) algorithm, typically RSA, and the other is a symmetric-key algorithm (usually AES or 3DES), in which both sides share the same secret key. The first is used only to transfer the second one’s key in encrypted form, because the symmetric algorithm is much faster for encrypting the actual data. But I don’t want to bore you with explanations about cryptography and discrete mathematics.
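For the curious, here is a rough sketch of that two-step idea in Python. Everything in it is an assumption made for illustration: the RSA modulus is built from two small Mersenne primes and is far too short for real use, the `keystream_xor` function is a home-made stand-in for AES, and the names `client_send` and `server_receive` are hypothetical. It is not how a real TLS library is implemented; it only shows the public key protecting a random symmetric key, and the symmetric key protecting the data.

```python
import hashlib
import secrets

# Toy RSA key pair (illustration only: real keys use 2,048 bits or more).
P, Q = 2**61 - 1, 2**89 - 1          # two known Mersenne primes
N = P * Q                            # public modulus
E = 65537                            # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))    # private exponent, kept on the server


def keystream_xor(key: int, data: bytes) -> bytes:
    """Toy symmetric cipher standing in for AES: XOR with a SHA-256 keystream.

    Applying it twice with the same key returns the original data.
    """
    out = bytearray()
    for offset in range(0, len(data), 32):
        pad = hashlib.sha256(f"{key}:{offset}".encode()).digest()
        out.extend(b ^ p for b, p in zip(data[offset:offset + 32], pad))
    return bytes(out)


def client_send(message: bytes):
    # 1. Pick a random symmetric key for this conversation.
    session_key = secrets.randbelow(N - 2) + 2
    # 2. Encrypt that key with the server's PUBLIC key (slow, done once).
    wrapped_key = pow(session_key, E, N)
    # 3. Encrypt the actual data with the symmetric key (fast).
    return wrapped_key, keystream_xor(session_key, message)


def server_receive(wrapped_key: int, ciphertext: bytes) -> bytes:
    # Only the holder of the PRIVATE exponent can recover the symmetric key.
    session_key = pow(wrapped_key, D, N)
    return keystream_xor(session_key, ciphertext)
```

The slow public-key operation runs once per conversation, while the fast symmetric cipher handles every byte of the page, which is the reason for combining the two.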
The important thing you need to know is that public-key encryption, which is used to protect the information, is also used to make digital signatures, which verify that our domain belongs to us. If I had to make a real-life comparison, I’d say it’s like signing a document before a notary who attests that your domain belongs to your company, with all users of your website present as witnesses. This “notary” is called a Certification Authority (CA). CAs are businesses that sell the SSL digital certificates needed for HTTPS, and the certificate is usually the most expensive part of contracting this service with your hosting provider. Your hosting provider may offer you certificates from a CA it has an agreement with, or it may allow you to use a certificate you bought from another CA. The latter scenario saves you from dealing with an intermediary, but it doesn’t necessarily mean the process will turn out cheaper.
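The “notary” idea can also be sketched in Python. As before, this is a toy under stated assumptions: the key is built from two small Mersenne primes and would be trivially breakable in practice, real certificates use padding schemes that are omitted here, and `ca_sign` and `browser_verify` are hypothetical names. The point is only that the CA signs with its private key and anyone can check the signature with the public one.

```python
import hashlib

# Toy CA key pair (illustration only, far too small for real use).
P, Q = 2**61 - 1, 2**89 - 1
N = P * Q                            # public, shipped with the browser
E = 65537                            # public, shipped with the browser
D = pow(E, -1, (P - 1) * (Q - 1))    # private, known only to the CA


def digest(data: bytes) -> int:
    # Condense the certificate content into a number smaller than N.
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N


def ca_sign(certificate_data: bytes) -> int:
    # The CA "stamps" the digest using its PRIVATE exponent.
    return pow(digest(certificate_data), D, N)


def browser_verify(certificate_data: bytes, signature: int) -> bool:
    # The browser undoes the stamp with the PUBLIC exponent and compares.
    return pow(signature, E, N) == digest(certificate_data)
```

If a single character of the certificate data changes, the digest changes with it and the check fails, so a signature issued for one domain cannot be reused for another.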
What type of SSL certificate should you choose?
There are several kinds of certificates, which can be classified in the following manner:
Depending on the number of domains
- One domain only: the certificate covers a single domain, and each IP address can have one certificate only. If we have another domain on the same server and we want to implement HTTPS on it too, we will have to buy another certificate and another IP address.
- Multi domain: one certificate can be used on several websites hosted on the same server, under the same IP address.
- Wildcard: these certificates cover all the subdomains of a domain. For example, if we want to have both https://mydomain.com and https://www.mydomain.com, because we want to redirect https://mydomain.com to https://www.mydomain.com, with a wildcard certificate we will only need one certificate and an IP address, while with a one domain only certificate we would need two of each.
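To make the difference between these certificate types concrete, here is a small Python sketch of how a hostname is compared with the names listed in a certificate. It assumes the usual rule that `*` stands for exactly one subdomain label, and it assumes a wildcard certificate that lists both the bare domain and the wildcard name, which you should confirm with your CA. The function names are hypothetical.

```python
def name_matches(pattern: str, hostname: str) -> bool:
    """Compare one certificate name, possibly starting with '*', to a hostname."""
    p_labels = pattern.lower().split(".")
    h_labels = hostname.lower().split(".")
    if len(p_labels) != len(h_labels):
        return False          # '*' never spans several labels, nor zero labels
    if p_labels[0] == "*":
        return p_labels[1:] == h_labels[1:]
    return p_labels == h_labels


def certificate_covers(cert_names, hostname: str) -> bool:
    """True if any name listed in the certificate matches the hostname."""
    return any(name_matches(name, hostname) for name in cert_names)


# Hypothetical certificates for the example domain used in this post.
single_domain = ["www.mydomain.com"]
wildcard = ["mydomain.com", "*.mydomain.com"]
```

With these lists, `certificate_covers(wildcard, "shop.mydomain.com")` is true, while `certificate_covers(single_domain, "mydomain.com")` is false, which is why the one domain only option needs a second certificate for the redirect.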
Self-signed certificates are those which we sign ourselves, vouching that our website belongs to us. This type of certificate is free and is normally used for testing, or to securely access services that only we are going to use. Web browsers display a security warning for these certificates.
Certificates signed by a CA, depending on their validation level
The validation level is based on the amount of bureaucracy that is needed to validate that our business truly owns the domain.
- Domain validation: with this type of validation you are not asked any questions; the CA only verifies that you control the domain. Many certification authorities do not provide this service, as it involves almost no checks and the validation is very easy to obtain. Let’s Encrypt, a relatively recent CA, offers this kind of certificate for free, and hackers are already taking advantage of this. Some Internet browsers display these certificates with a grey padlock, and others with a green one, in the address bar.
- Business or organisation validation: this validation level is stricter. You must prove that you own the business you claim to own by providing the company name and its managing director, confirm that you are the real proprietor of the domain by answering an e-mail sent to the domain admin, and provide anything else the CA requests of you. These certificates are usually identified with a green padlock by browsers.
- Extended validation: for this validation level you’re required to provide even more information, and the CA performs more thorough checks, for example, whether the business’s physical address is the same as the one given for domain registration. It is also a more costly validation type, and for that reason it is mostly used by banks and other important entities. It is displayed with a green padlock and the name of the validated business. Moreover, if we click on the padlock symbol, we can see more information about the business in question.
- Number of bits in the RSA public key: the more bits, the more secure the certificate is. Google recommends public-key certificates of at least 2,048 bits.
- Hash algorithm type for the signature: the algorithm that condenses the certificate’s content into a short, fixed-size digest, which is what the CA actually signs. It’s important to note that SHA-1 has been considered obsolete since 1 January 2017, and browsers may display a warning. It is recommended to use a certificate with some SHA-2 variant, such as SHA-256.
- Type of symmetric encryption algorithm: it can be AES or 3DES (DES is obsolete). It is recommended to use AES because it’s faster, with a key of 128 bits at the very least.
- Browser compatibility: all CAs usually comply with this requirement, but it doesn’t hurt if they explicitly state that their certificates are compatible with 99% of browsers. It’s impossible to be 100% compatible, though, as there may be very old browsers where they don’t work.
We can find almost any combination of these features in certificates provided by Certification Authorities. For example, we can have a multi domain certificate, wildcard certificate and extended validation certificate, all in one.
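The technical recommendations above can be written down as a tiny checklist. The following Python sketch is just that, a checklist: the function name and its parameters are hypothetical, and the values have to be read off the CA’s product description or the certificate details shown by your browser, since the code does not inspect a real certificate.

```python
SHA2_FAMILY = {"sha224", "sha256", "sha384", "sha512"}


def certificate_warnings(rsa_bits: int, signature_hash: str,
                         cipher: str, cipher_bits: int) -> list:
    """Return the recommendations from this post that an offer fails to meet."""
    warnings = []
    if rsa_bits < 2048:
        warnings.append("RSA key shorter than 2,048 bits")
    # Accept spellings such as "SHA-256" or "sha256".
    if signature_hash.lower().replace("-", "") not in SHA2_FAMILY:
        warnings.append("signature hash is not a SHA-2 variant")
    if cipher.upper() != "AES":
        warnings.append("symmetric cipher is not AES")
    elif cipher_bits < 128:
        warnings.append("AES key shorter than 128 bits")
    return warnings
```

An empty list means the offer meets every recommendation; for example, `certificate_warnings(2048, "SHA-256", "AES", 128)` returns `[]`.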
Some well-known certification authorities are GeoTrust, Verisign, Thawte, Comodo, Symantec, etc. There are also certain resellers who offer cheaper certificates, and there’s also Let’s Encrypt, whose certificates are free (although currently their certificate management service isn’t available to Windows users). So, you know, just enter “SSL certificates” or something along those lines into Google and start comparing certificates and prices, to find the option that best suits your needs and budget. To give you an idea, the price for one domain usually ranges between €40 and €300 per year.
In terms of SEO, what should you keep in mind to avoid losing traffic in a migration?
- 301 redirects: you must redirect all http URLs to their https version. For example, if our domain is https://www.mydomain.com/, and we take into account all possibilities, we recommend making the following redirections:
- http://www.mydomain.com/ to https://www.mydomain.com/
- https://mydomain.com to https://www.mydomain.com/ (you must have a certificate for mydomain.com, as this redirection occurs after HTTPS has been validated)
- http://mydomain.com/ to https://www.mydomain.com/
- Canonical link elements and alternate link elements: all rel="canonical" and rel="alternate", for languages and alternate mobile versions, must be changed to https.
- Sitemap: if it’s possible to regenerate sitemap files, we must absolutely do so.
- In Google Search Console, we must change the main domain to the https version.
- We must check the robots.txt file and edit it if there are absolute URL addresses, or if we’re filtering traffic by https.
- We must update the links and URL addresses pointing to images in our code. If there’s an image in our code that loads through http, the browser will display a warning over the padlock symbol. If we have iframes loading through http, it is possible that, due to the default security policy, the browser will not display them.
- We must crawl our entire website with a crawler tool, to make sure that all links are in working order.
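Two items from this checklist lend themselves to a quick script: the redirect rules and the hunt for resources still loaded through http. The Python sketch below is a minimal illustration, assuming the example domain used in this post; `redirect_target` and `find_insecure_resources` are hypothetical helper names, and in practice the redirects themselves are configured on the web server or in the application, not in a script like this.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit, urlunsplit

CANONICAL_HOST = "www.mydomain.com"                  # example domain from this post
KNOWN_HOSTS = {"mydomain.com", "www.mydomain.com"}


def redirect_target(url: str):
    """Return the URL a 301 should point to, or None if no redirect is needed."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host not in KNOWN_HOSTS:
        return None                                  # not one of our URLs
    if parts.scheme == "https" and host == CANONICAL_HOST:
        return None                                  # already the final form
    return urlunsplit(("https", CANONICAL_HOST, parts.path or "/",
                       parts.query, parts.fragment))


class _InsecureResourceFinder(HTMLParser):
    def __init__(self):
        super().__init__()
        self.insecure = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            return                                   # ordinary links are not loaded with the page
        for name, value in attrs:
            if name in ("src", "href") and value and value.startswith("http://"):
                self.insecure.append((tag, value))


def find_insecure_resources(html: str):
    """List (tag, url) pairs for resources the page still loads through http."""
    finder = _InsecureResourceFinder()
    finder.feed(html)
    return finder.insecure
```

Running `redirect_target` over the URLs found by a crawler is one way to check that every old address ends up at its https version in a single hop.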
What should a system administrator keep in mind to avoid losing traffic when moving to https?
Installing a certificate can be a complex operation, especially if we don’t have a panel to automate this process. Moreover, it also depends on the web server. I’m not going to get into what it is or how it’s done, but I am going to specify the configuration aspects you must keep in mind to keep your website from becoming unavailable, and how to check that the system administrator who installed the certificate has done the job properly:
It is extremely important that the system administrator remembers to renew the certificate every year, or whenever it is supposed to be renewed. It is not rare for us to run into clients whose websites were blocked with a security warning for more than a day because their admin forgot to renew the certificate. The renewal process takes time, so it must be done before the expiration date. Certification authorities won’t charge you for renewing your certificate in advance; on the contrary, many of them reward you with discounts.
You can check a certificate’s expiration date by looking at its properties in advanced settings of the browser.
If we use Let’s Encrypt, however, the renewal is carried out automatically by their software.
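If you would rather not depend on anyone’s memory, the expiration date can also be checked by a script. The following Python sketch uses the standard `ssl` module; `fetch_not_after`, `days_until_expiry` and `needs_renewal` are names I made up, and the 30-day margin is an arbitrary assumption you should adapt to how long your CA takes to renew.

```python
import socket
import ssl
import time
from typing import Optional


def fetch_not_after(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Connect to the site and return its certificate's expiration date as text."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as raw:
        with context.wrap_socket(raw, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]     # e.g. "Jan  5 09:34:43 2018 GMT"


def days_until_expiry(not_after: str, now: Optional[float] = None) -> float:
    """Days left before the given expiration date (negative if already expired)."""
    expires = ssl.cert_time_to_seconds(not_after)
    if now is None:
        now = time.time()
    return (expires - now) / 86400


def needs_renewal(not_after: str, now: Optional[float] = None,
                  margin_days: int = 30) -> bool:
    """True when the certificate expires within the safety margin."""
    return days_until_expiry(not_after, now) <= margin_days
```

A daily scheduled task that calls `needs_renewal(fetch_not_after("www.mydomain.com"))` and sends an e-mail when it returns true would be enough to avoid the day-long outages mentioned above.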
Installation of root and intermediate certificates on the server
Certification authorities also have their own digital signature that they use to sign your certificate. These CA certificates are called root certificates. Their public key comes pre-installed in all browsers to verify the certificate signature on the website using them.
There is also another type of certificate, called intermediate, which is used for security purposes: it allows the root certificate’s private key to be kept offline, because if that key were ever exposed the root certificate would have to stop being trusted. When there’s an intermediate certificate, the root certificate signs it, and the intermediate one, in turn, signs the certificate of our website.
This intermediate certificate must be installed on the server so that, if the browser doesn’t already have it, the server can transmit it together with the website’s certificate. Sometimes the admin forgets to install the intermediate certificate on the server, which stops HTTPS from working in some browsers. For that reason, it is recommended to check the website on all the most-used browsers, both on mobile and desktop devices, just to make sure. If HTTPS doesn’t work in one of the browsers, the most likely cause is that the administrator forgot to install the intermediate certificate on the server.
We can see what certificates we have installed by accessing the advanced configuration of our browser.
It is also recommended to use the latest TLS version your server supports (1.2 at the time of writing; TLS 1.3 has since been standardised). SSL is the older predecessor of TLS.
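Besides trying several browsers, the certificate chain and the TLS version can be checked from a script. The sketch below uses Python’s standard `ssl` module with its default settings, which verify the chain against the system’s root certificates. As far as I know, Python does not download a missing intermediate certificate on its own, so a server that forgot to send it should show up here as a certificate problem, but treat that as an assumption to confirm on your own setup. `strict_context` and `check_https` are hypothetical names.

```python
import socket
import ssl


def strict_context() -> ssl.SSLContext:
    """Default verification (chain and hostname) plus a TLS 1.2 minimum."""
    context = ssl.create_default_context()
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


def check_https(host: str, port: int = 443, timeout: float = 5.0):
    """Return (True, negotiated TLS version) or (False, reason)."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as raw:
            with strict_context().wrap_socket(raw, server_hostname=host) as tls:
                return True, tls.version()
    except ssl.SSLCertVerificationError as exc:
        # Expired, self-signed, wrong hostname, or an incomplete chain.
        return False, f"certificate problem: {exc.verify_message}"
    except OSError as exc:
        # DNS failure, refused connection, or no protocol version in common.
        return False, f"connection problem: {exc}"
```

Calling `check_https("www.mydomain.com")` after the installation gives a quick second opinion that does not depend on what a particular browser has cached.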
Finally, when the admin activates HTTPS on our website, it should work through HTTP and HTTPS, and it’s the developer who should be in charge of adding all necessary redirections.
Online tool to detect issues
These are the most common mistakes that one can make during the implementation, but you can also visit this link to check that your certificate’s installation was done correctly.
Can I use HTTP2 and HTTPS at the same time?
HTTP2 is the successor to HTTP 1.1. Considering that HTTPS is actually composed of two protocols (HTTP and SSL/TLS) and security is provided by SSL/TLS, changing the HTTP version won’t affect what the next protocol does, in the same way that changing IPv4 to IPv6 doesn’t affect all other protocols. It is precisely for this reason that communication protocols were designed in layers, meaning that all combinations are possible: HTTP2 with HTTPS, HTTP2 without HTTPS, HTTP 1.1 with HTTPS, and HTTP 1.1 without HTTPS. In practice, however, browsers do not support the HTTP2 without HTTPS combination.
Is HTTPS really secure?
Yes, it is. Its only weak point is MitM (Man in the Middle) attacks. If we connect to the Internet through an open Wi-Fi connection created by a hacker, this person could intercept the communication in such a way that, when we type http://www.facebook.com in our browser, the hacker could stop the redirection to https://www.facebook.com and capture all our data (if we don’t realise that there isn’t a padlock in the address bar). To avoid this, at least in most cases, the HSTS (Strict-Transport-Security) header of the HTTP protocol is used.
This header, set to a duration of one year, tells the browser that HTTP requests cannot be made to the website during that time. This only presents one problem: we cannot go back to using HTTP on our website until that period expires.
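As an illustration, this is roughly what a one-year HSTS header looks like when built and read back in Python. The header name and the `max-age` directive (expressed in seconds) come from the HSTS standard; the helper names `hsts_header` and `hsts_max_age` are my own, and in a real site the header is normally added in the web server or framework configuration.

```python
from typing import Optional

ONE_YEAR_SECONDS = 365 * 24 * 60 * 60    # 31536000


def hsts_header(max_age: int = ONE_YEAR_SECONDS, include_subdomains: bool = False):
    """Build the (name, value) pair to add to every HTTPS response."""
    value = f"max-age={max_age}"
    if include_subdomains:
        value += "; includeSubDomains"
    return "Strict-Transport-Security", value


def hsts_max_age(value: str) -> Optional[int]:
    """Read the duration, in seconds, from an HSTS header value."""
    for directive in value.split(";"):
        name, _, number = directive.strip().partition("=")
        if name.lower() == "max-age" and number.strip('"').isdigit():
            return int(number.strip('"'))
    return None
```

`hsts_header()` produces `Strict-Transport-Security: max-age=31536000`. When testing, a much shorter `max_age` is a sensible precaution, because browsers remember the instruction for as long as the header says.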
Additional references
- HTTPS protocol definition on W3C
- Frequently asked questions about HTTPS
- Original definition of HTTP
- HTTP2 definition on GitHub
- Frequently asked questions about HTTP2
I hope this post was helpful to you, and if you have any questions, do not hesitate to leave a comment. I will try to reply to you as soon as possible.