We run a number of LAMP servers on AWS hosting a few dozen websites that customers pay us to design, build and host. They're Ubuntu 14.04 servers running Varnish, Apache and PHP.
Currently, if a customer wants SSL/TLS for their website, we put an Amazon ELB load balancer in front of the server to offload the TLS connection so Varnish can still cache the content. As a result, each server has ended up fronted by half a dozen ELBs (one per TLS customer or site), while non-TLS sites are handled directly by the server.
To reduce cost and make setup easier, we want to eliminate all the ELBs and terminate all TLS connections directly on the servers. This can easily be achieved by running a reverse proxy such as Hitch, Traefik or Nginx in front of Varnish, using Let's Encrypt certificates and SNI.
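As a rough sketch of what we have in mind (the hostname, certificate paths and Varnish port below are placeholders, not our actual setup), an Nginx server block per TLS-enabled site would terminate TLS and pass plain HTTP to Varnish, with SNI selecting the right block:

    # One block per 'ready' site; Nginx picks the block via SNI.
    server {
        listen 443 ssl;
        server_name ready-site.example.com;

        # Paths assume Let's Encrypt / certbot defaults.
        ssl_certificate     /etc/letsencrypt/live/ready-site.example.com/fullchain.pem;
        ssl_certificate_key /etc/letsencrypt/live/ready-site.example.com/privkey.pem;

        location / {
            # Varnish listening on localhost; 6081 is an assumed port.
            proxy_pass http://127.0.0.1:6081;
            proxy_set_header Host $host;
            proxy_set_header X-Forwarded-Proto https;
            proxy_set_header X-Forwarded-For $remote_addr;
        }
    }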
Some sites are not ready for TLS yet. They require work to fix mixed-content warnings and to prevent an SEO drop, and not all customers have the budget for that.
I can open up port 443 on a server and run a reverse proxy with TLS certificates installed for all 'ready' sites. Unfortunately, clients will still be able to connect to 'not ready' sites over HTTPS, though they will get certificate errors (Common Name mismatch, self-signed, etc.). We don't intend to link to the HTTPS versions of 'not ready' sites, of course, but someone could still type https:// in front of one of those domains.
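To illustrate what such a visitor would see, you can check which certificate the proxy presents for a hostname that has no certificate configured, for example with openssl (hostnames are placeholders):

    openssl s_client -connect server.example.com:443 -servername not-ready-site.example.com </dev/null 2>/dev/null | openssl x509 -noout -subject

For a 'not ready' site this prints the subject of whatever default certificate the proxy falls back to, which is exactly the mismatch browsers will warn about.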
I want to prevent a loss in SEO ranking for all sites, mainly in Google. I've been warned that Googlebot will discover the HTTPS versions of 'not ready' sites and index them despite the certificate errors and despite them not being advertised as HTTPS. This would lead to a horrible experience for visitors following those links, as well as a serious ranking loss. SEO rankings are hard to gain but easy to lose.
How does Googlebot (and perhaps similar bots) deal with HTTPS versions of 'not ready' sites? Will they be indexed, despite being broken and not being advertised?
How do you mitigate unwanted side effects when partially enabling HTTPS via SNI?