Moving WordPress Images To Google Storage

Summary

In my article Running Google App Engine Behind Cloudflare, the goal was to reach a point where the site can scale horizontally. One of the final barriers is where the images live: WordPress stores uploaded images under wp-content/uploads on the local machine.

Horizontal Scaling

Once the images are separated from the web server, the WordPress site can be horizontally scaled fairly easily. There are a few ways to do this, particularly in Google Cloud.

  1. We could spin up more VMs and point them to the database
  2. Load the WordPress Docker image into a Kubernetes Cluster
  3. Run WordPress in Google App Engine

The actual method of horizontal scaling is out of scope for this document, but moving the images is the last barrier before you can make that decision.

References

I am going to give credit where it is due up front. Google’s tutorial on running WordPress on Google App Engine was a good starting point but not the first article I came across on this.

I came across this article from Kinsta which has some pretty good directions on a tool that looks extremely promising.

Storage Plugins

With anything WordPress, there is a plugin for it! Here are some options. We will choose one of these for this article.

  1. Google Cloud Storage Plugin – I have not seen much on this one.
  2. WP Offload Media – This one seems to have been around the longest, but it will cost you to migrate existing content.
  3. WP-Stateless – Seems extremely promising. This shows up 3rd on the list but is the one we will implement in this article.

Preparation

For this article I decided it was probably not a good idea to make intrusive changes to this blog just to generate more content. For this reason I cloned production into staging.

Also, make sure to kick off a snapshot and backup of the database and VM beforehand.
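
As a rough sketch of that backup step, the script below prints (and optionally runs) a disk snapshot and a database dump. The disk name wp-staging, zone us-central1-a and database name wordpress are placeholders, not values from this environment, and it assumes MySQL credentials are already available to mysqldump (for example in ~/.my.cnf).

```shell
#!/bin/sh
# DRY_RUN=1 (the default) only prints the commands; set DRY_RUN=0 to execute them.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# backup_staging DISK ZONE DATABASE STAMP
# All four arguments are hypothetical names supplied by the caller.
backup_staging() {
    disk="$1"; zone="$2"; db="$3"; stamp="$4"
    # Snapshot the VM's persistent disk.
    run gcloud compute disks snapshot "$disk" --zone="$zone" --snapshot-names="${disk}-${stamp}"
    # Dump the WordPress database to a dated file in the current directory.
    run mysqldump --single-transaction --result-file="${db}-${stamp}.sql" "$db"
}

backup_staging wp-staging us-central1-a wordpress "$(date +%Y%m%d)"
```

Leaving DRY_RUN at its default lets you review both commands before anything is touched.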

Install WP-Stateless

We need to install and activate the WP-Stateless plugin.

Create Storage Bucket

WP-Stateless will guide you through creating the bucket, but I wanted to do that manually to walk through the options. In the Storage / Browser section click “Create Bucket”. I had every intention of using a custom domain name, but according to Google it does not appear to be supported over HTTPS.

Since I have Cloudflare set to strict, it expects a valid origin cert. If I were to downgrade the site to flexible, Cloudflare would connect to the origin over port 80 and it would likely work. This is a little bit concerning because Cloudflare’s CDN is great and the images now bypass it.

“Note: You can use a CNAME redirect only with HTTP, not with HTTPS. To serve your content through a custom domain over SSL, you can set up a load balancer.”

https://cloud.google.com/storage/docs/request-endpoints#cname
Name Bucket – use fully qualified domain name to help future proof but not required.

This is my lab so I chose the cheapest option, but since the bucket cannot sit behind Cloudflare in strict mode, you may opt for multi-region. Google Storage is not a full CDN, but there are tools you can layer on top of it for that.

Select Region

Next we will set up access control. I got this wrong during the initial setup by assuming there were no ACLs to configure. The result was that the sync would claim to work but nothing would actually happen. I left this in to show the config mistake.

Select Fine-grained
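
The same console choices (name, single region, fine-grained access) can be sketched as a gsutil command. The helper below only prints the command so it can be reviewed first. The bucket name media.example.com and the default location us-central1 are placeholders, not the values used here.

```shell
#!/bin/sh
# bucket_create_cmd BUCKET [LOCATION]
# Prints a gsutil command for a single-region, standard-class bucket.
# "-b off" turns uniform bucket-level access off, which corresponds to
# the "Fine-grained" (ACL-based) option in the console.
bucket_create_cmd() {
    bucket="$1"
    location="${2:-us-central1}"   # placeholder default region
    printf 'gsutil mb -b off -c standard -l %s gs://%s\n' "$location" "$bucket"
}

# Hypothetical bucket name; review the output, then run it by hand.
bucket_create_cmd "media.example.com"
```

Printing rather than executing keeps the sketch safe to paste into a shell that already has gsutil configured against a real project.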

Configure WP-Stateless

WP-Stateless then runs you through a nice setup wizard. You can also configure it manually. If you go that route, WP-Stateless’s instructions are fairly complete.

The first step asks you to log in, and it generates a JSON file for you so that it can authenticate.

Select your Project and Bucket

In the configuration we need to set a few options, most importantly the mode, which we set to Stateless.

Settings / General / Stateless

Due to the SSL issue we will leave the domain blank. There is currently nowhere to upload the origin cert, so a CNAME would present the certificate of the host it points to and cause a cert mismatch. In strict mode this won’t work, but it may work in “Full” or “Flexible”, since Cloudflare does not validate the origin cert in Full and does not use HTTPS to the origin in Flexible. If you are in “Full” or lower, give it a shot!

Run a sync and you’re off!

Initially I was running on a micro instance with under 1GB of RAM, and it ran out of memory and locked up. If the default bulk size is too much for your instance, you may need to lower it toward 1. I re-ran this on a 1.7GB instance with a bulk size of 1 and had no issues.

On the VM itself I validated that the images were removed. Before the sync:

$ find /var/www/html/wp-content/uploads/ | wc -l
441

After the sync:

$ find /var/www/html/wp-content/uploads/ | wc -l
42

There are still 42 images. We’ll track that down!
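
One caveat on these numbers: find with no filters counts directories (including uploads/ itself) as well as files, so the counts are an upper bound on the number of images. A small sketch that separates the two is below. The UPLOADS default simply mirrors the path used in this article.

```shell
#!/bin/sh
# Count regular files under a directory (directories excluded).
count_files() {
    find "$1" -type f | wc -l | tr -d ' '
}

# Count directories under a directory (the directory itself included).
count_dirs() {
    find "$1" -type d | wc -l | tr -d ' '
}

UPLOADS="${UPLOADS:-/var/www/html/wp-content/uploads}"
if [ -d "$UPLOADS" ]; then
    echo "files:       $(count_files "$UPLOADS")"
    echo "directories: $(count_dirs "$UPLOADS")"
fi
```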

Some of these images did not have the proper ownership. Since I manually synced the filesystems for this staging environment, some of the newer images ended up owned by root instead of www-data, as the listing shows.

$ find /var/www/html/wp-content/uploads/ -ls
   131221      4 drwxr-xr-x   3 www-data www-data     4096 Oct 25 19:22 /var/www/html/wp-content/uploads/
   131223      4 drwxr-xr-x   4 www-data www-data     4096 Nov  1 08:38 /var/www/html/wp-content/uploads/2019
   131224      4 drwxr-xr-x   2 www-data www-data     4096 Nov  8 14:40 /var/www/html/wp-content/uploads/2019/10
   131891     12 -rw-r--r--   1 www-data www-data     9512 Nov  8 03:46 /var/www/html/wp-content/uploads/2019/10/http_1_1-100x100.png
   131849      4 -rw-r--r--   1 www-data www-data     2835 Nov  8 03:46 /var/www/html/wp-content/uploads/2019/10/logo-100x100.jpg
   131866     12 -rw-r--r--   1 www-data www-data     9200 Nov  8 03:46 /var/www/html/wp-content/uploads/2019/10/K2167-100x100.png
   131682      4 -rw-r--r--   1 www-data www-data     3324 Nov  8 03:46 /var/www/html/wp-content/uploads/2019/10/wh_header-100x100.jpg
   131229     20 drwxr-xr-x   2 www-data www-data    20480 Nov  8 14:40 /var/www/html/wp-content/uploads/2019/11
   147546     12 -rw-r--r--   1 root     root         8442 Nov  8 03:46 /var/www/html/wp-content/uploads/2019/11/GAE-Default-768x117.jpg
   147574      4 -rw-r--r--   1 root     root         3173 Nov  8 03:46 /var/www/html/wp-content/uploads/2019/11/GAE-Cert-300x52.jpg
   147550      8 -rw-r--r--   1 root     root         7206 Nov  8 03:46 /var/www/html/wp-content/uploads/2019/11/CF-DNS-Only-1024x48.jpg
   136683      8 -rw-r--r--   1 root     root         6938 Nov  8 03:46 /var/www/html/wp-content/uploads/2019/11/SSL-CF-Origin-300x140.jpg
   131918      4 -rw-r--r--   1 root     root         1604 Nov  8 03:46 /var/www/html/wp-content/uploads/2019/11/CF-DNS-Only-150x49.jpg
   147575     32 -rw-r--r--   1 root     root        31759 Nov  8 03:46 /var/www/html/wp-content/uploads/2019/11/mysql-purge-1024x186.jpg
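
Rather than scanning the full listing by eye, the entries with the wrong owner can be picked out directly. This is a minimal sketch that assumes the web server user is www-data, as in the listing above. The helper name list_wrong_owner is my own.

```shell
#!/bin/sh
# list_wrong_owner DIR [OWNER]
# Prints every file and directory under DIR that is NOT owned by OWNER.
# OWNER defaults to www-data and may be a user name or a numeric uid.
list_wrong_owner() {
    dir="$1"
    owner="${2:-www-data}"
    find "$dir" ! -user "$owner" -print
}

UPLOADS="${UPLOADS:-/var/www/html/wp-content/uploads}"
if [ -d "$UPLOADS" ]; then
    list_wrong_owner "$UPLOADS"
fi
```

An empty result after the chown is a quick confirmation that nothing was missed.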

Chown to the rescue

$ chown -R www-data:www-data /var/www/html/wp-content/uploads/*

After running the sync again we’re down to 8, and these are actually unused images.

$ find /var/www/html/wp-content/uploads/ | wc -l
8

Final Words

Here we have used a free plugin to move our images to a shared, central repository. Because I want to keep Cloudflare in strict mode, this setup does not leverage Cloudflare’s CDN, but you are able to make your own decision on that.

UPDATE: 20191109 – I can confirm that lowering Cloudflare security to Full and adding the CNAME to c.storage.googleapis.com does allow this to work. Whether I go that route is a decision I will make when I actually need it.

At this point, at least in this test environment, I could spin up multiple front ends to handle any excess traffic.

Another benefit is that it keeps your VM light, since it no longer has to store all of your images.

Author: David Chapman

I started my IT career in the very late 1990s as a systems administrator for a benefits company. I have always enjoyed learning new technologies and helping people fix problems, which has led me through a few different positions over the past 20+ years.