November 2019 - Page 2 of 3 - Woohoo Services Blog!

New Active Directory Domain with Windows Server 2016

Summary

The purpose of this article is to stand up a new Active Directory Domain with Windows Server 2016. The starting point for most people is that they think they need a domain but have no domain controller. Usually it is just a few servers and workstations set up in a Workgroup.

Planning Your New Active Directory Domain

As with many projects, planning should be the majority of the work. If you plan properly, the execution is likely only 10% of the project.

A Domain Controller will be the DNS server for all servers and workstations in the Domain. There are quite a few other roles that become a single point of failure with one Domain Controller. Typically a Domain will have at least two Domain Controllers for this reason.

Be prepared to modify the workstation and server DNS servers to use the new Domain Controller. If you use DHCP, we will need to modify the appropriate DHCP scope. If you do not, we can use the Domain Controller for that to ease the IP configuration management of those workstations.

I chose Server 2016 for this article because 2019 was just released and I prefer “tried and true” operating systems versus bleeding edge. Let others test out 2019 until quite a few patches have been released.

Requirements

Windows Server 2016 has surprisingly low requirements. The minimum is 1 core and 512MB of RAM (Microsoft lists 2GB for the Desktop Experience install option). Domain Controllers do not require a ton of RAM but you do want to ensure your entire Active Directory can be cached in memory. For a production Active Directory Domain that is relatively small, 1-2 cores and 4GB of RAM is usually sufficient. I define relatively small as only a few hundred users. Monitoring RAM and CPU will help you tune this.

For the sake of this lab we will allocate 1 core, 1 GB of RAM and a 20-25 GB disk. During the course of this lab I did increase the RAM to 1.5 GB, because memory started to run low during Windows Updates and the server complained.

Installing Windows 2016

When booting off the ISO or CD/DVD, it will prompt you to press any key. This is to help ensure that if Windows is already installed and the media is left attached, it will not automatically go through the install process again.

It will then prompt you for a few self-explanatory questions such as language/locale and then ask you to click “Install”.

Editions

In the Windows 2008 days, there used to be Standard, Enterprise and Datacenter. Standard was the typical install. Enterprise was required for clustering servers, large amounts of RAM and a few other scenarios. Datacenter was essentially Enterprise with unlimited virtualization rights and a few other limitations removed. In recent versions, many of the Enterprise features were rolled into Standard edition, leaving us with Standard and Datacenter.

Aside from that, previously there was Server and Server Core, which appear to have been renamed. Core is a minimal install without a GUI. When Windows boots up, you simply get a command prompt. For this we will install the Desktop Experience in Standard. Details about this can be further read in https://docs.microsoft.com/en-us/windows-server/get-started/getting-started-with-server-with-desktop-experience

Windows Server 2016 Standard (Desktop Experience)

Setup Install Steps

Accept the license terms on the next screen! Then choose “Custom” as we are not upgrading.

Next we will select which drive to install on. There is only one option on our server so we just click next.

Now we are off to the races. This next step can take a while depending on the specifications of your server.

Windows Setup - Install Status

It will then automatically reboot when done if there are no issues. This will lead you into a screen to set the Administrator password. Remember this password. After we promote this Active Directory Domain Controller, it will become the Domain “Administrator” password.

Customize settings - Administrator - Password

We now have a Windows 2016 Standard Server installed and ready to login.

Running Windows Updates

Every new install I do, I like to install windows updates first. This helps ensure the security of the server before bringing it into production. It also fixes any known bugs on the RTM release installed.

In Server Manager, under Local Server, find Windows Update and click the boxed-in area.

Server Manager / Local Server / Windows Update

Next, Check for Updates! On a fresh install this can take some time, 20-30 minutes. The update process has to do a full inventory of installed updates, even though this is a fresh install.

Windows Update Status - Updates are available.

After downloading, it will naturally go through the installation which is usually a bit longer than the download.

In previous versions of Windows it may take a few iterations of updates to get fully updated. After rebooting from Windows Updates, check again to make sure this is not the case. More recent versions of Windows have been better about this.

Naming the Server

Before we get too far along, we want to choose a name for the server. Once the server is promoted to a Domain Controller, the name is difficult to change. Newer versions of Windows are better at handling this but it is less than ideal. For this lab we left the automatically generated server name alone. One could argue that doing so is more secure because it is random. That said, it would be a pain to manage as you could not remember which server was where.
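If you do pick your own names, it can help to sanity check them before the rename. Below is a minimal sketch in Python, assuming the commonly documented constraints of a 15-character NetBIOS limit and DNS-safe characters. The helper name check_server_name is hypothetical and not part of any Windows tooling.

```python
import re

# Hypothetical helper: checks a proposed server name against commonly
# documented constraints (15-character NetBIOS limit, DNS-safe characters).
_LABEL = re.compile(r"^[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?$")


def check_server_name(name: str) -> list[str]:
    """Return a list of problems; an empty list means the name looks usable."""
    problems = []
    if len(name) > 15:
        problems.append("longer than 15 characters (NetBIOS limit)")
    if not _LABEL.match(name):
        problems.append("use only letters, digits and inner hyphens")
    if name.isdigit():
        problems.append("must not be entirely numeric")
    return problems


print(check_server_name("DC01"))
print(check_server_name("WIN-THIS-NAME-IS-TOO-LONG"))
```

A short, role-based name such as DC01 passes cleanly, while anything over 15 characters is flagged before you commit to it.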

The name change can happen in Server Manager by clicking on the existing server name and following the prompts. It will require a reboot.

Change Name if desired - Most likely yes in production!

You may have caught the woohoosvcs.local in the name. It wouldn’t be here at this point but I decided to throw this in after promoting it.

Installing Active Directory Domain Services

From Server Manager choose Add Roles from the Manage Menu.

*Server Manager Add Roles and Features Wizard*

We will then select a “Role-based or feature-based installation”. The Server Manager has the ability to manage remote servers but we will choose the singular local server. We will select “Active Directory Domain Services”.

The defaults are acceptable, particularly if you want the management tools installed. Hint: You usually do unless you know you do not!

Active Directory Domain Services features

We then “next next next” through the other options and click “Install”. At the end of this we need to reboot. On the “Install” screen there is a checkbox to automatically reboot if you choose.

Finally, we have Active Directory Domain Services installed. We have not enabled them but they are there!

Promoting an Active Directory Domain Controller

Now that we have Active Directory installed, we can “promote” it. A Domain Controller before it is a Domain Controller is just a server. We then promote it to a Domain Controller. The command to do this used to be “dcpromo.exe”. It may still work although I believe it is deprecated in this version.

Promotion Wizard

In Server Manager, click the flag with the yellow warning. It is letting us know we installed Active Directory Domain Services but never promoted it. Then click “Promote this server to a domain controller”

Promote this server to a domain controller

By default it wants you to add a domain controller to an existing domain. This is the most common use case as not everyone is standing up new domains on a regular basis.

Add a new forest for Active Directory Domain Services

I believe Microsoft is moving away from the “.local” suffix for domains but in order to avoid a split-brain DNS scenario I created a unique root domain name.

Promotion Roles and Options

The next set of questions requires some thought.

We definitely want DNS enabled. Global Catalog is required because you need at least one per domain. The DSRM password is for when you need to boot into Directory Services Restore Mode to help recover the Active Directory. You will notice I skipped over the functional levels.

Functional levels serve as least common denominators. A domain functional level of “Windows Server 2016” means that all domain controllers are at least at that level and you cannot promote any domain controllers from earlier versions. This allows new 2016 features to be enabled for the domain. When all of the domains are at a functional level, the forest can be increased as well.
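The "least common denominator" idea can be sketched in a few lines of Python. This is only an illustrative model: the level labels and both function names are assumptions for the example, not an Active Directory API.

```python
# Ordered oldest to newest; labels are illustrative, not an official API.
LEVELS = ["2008", "2008 R2", "2012", "2012 R2", "2016"]


def highest_domain_level(dc_os_versions):
    """The domain level can be no higher than the oldest domain controller."""
    return min(dc_os_versions, key=LEVELS.index)


def can_promote(new_dc_os, domain_level):
    """A server can only become a DC if its OS is at or above the domain level."""
    return LEVELS.index(new_dc_os) >= LEVELS.index(domain_level)


print(highest_domain_level(["2016", "2012 R2", "2016"]))
print(can_promote("2012", "2016"))
```

With one 2012 R2 controller still around, the domain is capped at that level, and a domain already at the 2016 level rejects an older server.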

The lowest functional level you can set with 2016 is 2008. If you do not know why you would do this, it’s best to leave this one alone. Particularly if your machines are relatively new in the past 5 years.

This error for DNS Delegation is normal, particularly since we have no Active Directory DNS yet!

A delegation for this DNS server cannot be created because the authoritative parent zone cannot be found.

Moving Along Our Active Directory Domain Configuration

For this document, you can accept the defaults for the next few screens but make note of the settings. On the review screen you can actually copy these and paste them into a text file somewhere.

An interesting note is that this wizard creates a configuration file that notes the changes as a playbook. Below we will walk through the prerequisite checks.

Active Directory Domain Controller Promotion - Prerequisites Check

These are all normal for a lab. On a production server, the second warning is valid and should be addressed. In that scenario we would allocate a static IP to ensure the server keeps its IP address. This is particularly so since DHCP services tend to be loaded onto a Domain Controller.

Please click install at this point. Wait a few minutes and if you are not paying attention it will automatically reboot. This is fine, you just will not see the success screen.

The next time you login to the Domain Controller you will be doing so using Active Directory Domain Services! The first boot may take a while as Group Policies and settings are applying.

Joining a Machine to Active Directory Domain Services

What good would this document be if we left it there? We need proof it actually works!

Active Directory Domain Controller IP Address

On the domain controller, let’s get its IP address via the ipconfig command.

C:\Windows\system32>ipconfig
.....
   IPv4 Address. . . . . . . . . . . : 192.168.116.183
.....

We need this to set the Windows 8.1 Workstation’s DNS settings. The workstations use DNS to navigate Active Directory and find the resources they need. Typically this is done via DNS SRV records.
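To make the SRV lookup concrete, here is a small Python sketch that builds a few of the well-known record names a client queries when looking for a domain controller. The function name is hypothetical and the list is not exhaustive.

```python
def dc_locator_srv_names(domain: str) -> list[str]:
    """Build a few well-known SRV record names for an AD DNS domain."""
    domain = domain.strip(".").lower()
    return [
        f"_ldap._tcp.dc._msdcs.{domain}",  # domain controllers for the domain
        f"_ldap._tcp.{domain}",            # LDAP servers
        f"_kerberos._tcp.{domain}",        # Kerberos KDCs
    ]


for name in dc_locator_srv_names("woohoosvcs.local"):
    print(name)
```

Any of these names can be checked from the workstation with nslookup -type=SRV. If the query returns nothing, the workstation is most likely not pointed at the Domain Controller for DNS.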

Workstation DNS IP

On our Windows Workstation, go to the Control Panel / Network and let’s modify the DNS settings for the interface.

A common mistake is to add an alternate server that is a non domain controller such as Google’s 8.8.8.8. Doing so may appear to work but at some point the workstation will try to query active directory using Google and it will not quite work right.
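A quick way to audit for this mistake is to compare each workstation's configured DNS servers against the list of Domain Controllers. The sketch below assumes you have already collected both lists; the function name is hypothetical.

```python
import ipaddress


def non_dc_dns_servers(configured, domain_controllers):
    """Return configured DNS servers that are not domain controllers."""
    dcs = {ipaddress.ip_address(a) for a in domain_controllers}
    return [a for a in configured if ipaddress.ip_address(a) not in dcs]


# The lab DC from above plus the common "alternate DNS" mistake.
print(non_dc_dns_servers(["192.168.116.183", "8.8.8.8"], ["192.168.116.183"]))
```

Anything the function returns is a resolver that cannot answer Active Directory queries and should be removed from the workstation.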

Windows 8 - Change Settings

Select “Domain” and type in the domain name you chose, in this case it was “woohoosvcs.local”

Domain - woohoosvcs.local

A nice tidbit is anyone can join machines to the domain but in the past regular users had a limit on the number that they could. I do not know if this is still valid but best to use a Domain Admin which “Administrator” is.

Welcome to woohoosvcs.local - Woohoo!

You will need to reboot and then the machine is joined to the domain. At this point there is only one domain user “Administrator”.

Final Words

This document walked us through installing Windows Server 2016. We then installed Active Directory Domain Services and promoted the server to Domain Controller. Finally we joined a workstation to the domain.

Packet Capture – Introduction to Wireshark

Summary

This guide is designed for someone that has never performed a packet capture before or may have had to a few times but really did not understand it. Many times the packet capture can seem like a needle in a haystack.

Brief History

Wireshark has been around for quite some time. It started in 1998 as Ethereal but had to change its name. You can find a full history of that on their Wikipedia page: https://en.wikipedia.org/wiki/Wireshark

Installing Wireshark

Wireshark can be downloaded from https://www.wireshark.org/download.html

During the installation, most of the defaults should work. On the machine you want to perform the capture on, make sure WinPcap or its successor Npcap is installed. That is what allows the packets to actually get captured on Windows. UNIX-like operating systems already come with the necessary libraries.

Capture Packets!

Here we will filter based on port 443 as we intend to make a connection to https://blog.woohoosvcs.com on port 443. First we need to select the adapter though. If you’re unsure of which one, you can see the traffic graph (squiggly lines). If you type the capture filter first and then change the adapter the filter will clear.

Filter based on port 443 for HTTPS and on the Wi-Fi adapter.

We want to limit the capture as much as possible because there will be a lot of traffic without a filter. Be careful though as filtering too much can lead to not capturing the intended packets.

Next I will curl or make a connection to https://blog.woohoosvcs.com

I am using curl -v so that I can see the IP address. This is an IPv6 address.

% curl -v https://blog.woohoosvcs.com
*   Trying 2606:4700:20::681a:d78...
* TCP_NODELAY set
* Connected to blog.woohoosvcs.com (2606:4700:20::681a:d78) port 443 (#0)

After running curl we want to click the red square button in wireshark to stop the capture. These can grow rather large.

We can then set a display filter to the ip address 2606:4700:20::681a:d78

ipv6.addr == 2606:4700:20::681a:d78

This helps us narrow down to just the packets necessary that we want to analyze. The capture filter restricts the packets that wireshark even sees coming from pcap. The display filter does just that. It filters what you are displaying but all the other packets it captured are still there.
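The difference between the two filters can be modelled in a few lines of Python. This is a toy model, not how Wireshark is implemented: the packet dictionaries, the two documentation-range addresses and both function names are made up for illustration.

```python
def capture(wire, capture_filter):
    """Packets rejected here are never stored, so they cannot be recovered later."""
    return [p for p in wire if capture_filter(p)]


def display(stored, display_filter):
    """Only hides rows; the stored list is left untouched."""
    return [p for p in stored if display_filter(p)]


wire = [
    {"addr": "2606:4700:20::681a:d78", "port": 443},
    {"addr": "2001:db8::1", "port": 443},
    {"addr": "2001:db8::2", "port": 53},
]

stored = capture(wire, lambda p: p["port"] == 443)
shown = display(stored, lambda p: p["addr"] == "2606:4700:20::681a:d78")
print(len(wire), len(stored), len(shown))  # 3 2 1
```

Clearing the display filter brings the second port 443 packet back into view, but the port 53 packet is gone for good because the capture filter never let it in.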

Analyzing Packet

TCP Handshake

The first step to any TCP connection like HTTPS is a 3-way handshake. TCP is a stateful, connection-oriented protocol. It uses flags to help keep track of the connection. When making a new connection from my machine (A) to blog.woohoosvcs.com (B) the handshake looks roughly like this:

A -> B (flag: SYN) - A is telling B it wants to make a new connection.
B -> A (flag: SYN+ACK) - B is telling A it “ACKnowledges” the original SYN and agrees to the connection with its own SYN.
A -> B (flag: ACK) - A is telling B it ACKs the SYN from B.
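When reading a capture, the check is simply whether the first three packets of the conversation carry the expected flags in the expected direction. Here is a minimal sketch of that check in Python; the tuple layout and function name are assumptions for the example.

```python
# The three segments that open a TCP connection from A (client) to B (server).
EXPECTED = [
    ("A", "B", {"SYN"}),
    ("B", "A", {"SYN", "ACK"}),
    ("A", "B", {"ACK"}),
]


def is_three_way_handshake(segments):
    """segments: list of (sender, receiver, flags) in capture order."""
    first_three = [(s, r, set(f)) for s, r, f in segments[:3]]
    return first_three == EXPECTED


good = [("A", "B", ["SYN"]), ("B", "A", ["SYN", "ACK"]), ("A", "B", ["ACK"])]
print(is_three_way_handshake(good))  # True
```

A conversation that shows only repeated SYNs from A, or an RST from B in the second position, fails the check and tells you the connection never opened.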

Visually we see that here

In many instances, if the remote end does not want to accept your packet, it simply will not respond. You may see your SYN sent with nothing in return, followed by SYN retries. Other times, if it wants to forcefully deny the connection, you will get an RST or RST+ACK from the remote end instead of a SYN+ACK.
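The same three outcomes are visible from a client without a capture. The sketch below uses Python's standard socket module to classify a connection attempt; the function name and the result labels are my own.

```python
import socket


def probe(host, port, timeout=2.0):
    """Classify a TCP connection attempt by how the remote end answered the SYN."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"         # SYN+ACK received, handshake completed
    except ConnectionRefusedError:
        return "refused"          # remote end answered with an RST
    except socket.timeout:
        return "no response"      # SYNs went unanswered until the timeout
```

A result of "no response" usually points at a firewall silently dropping packets, while "refused" means something answered but nothing is listening on that port. Other failures, such as name resolution errors, are deliberately left to raise.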

Now that the connection is open and established, we can inspect the TLS handshake.

TLS Handshake

In the above, right below the 3 way handshake we can see a TLS “Client Hello”. This is similar to the 3 way handshake except for TLS. The client, in this case “curl” is trying to negotiate compatible methods of communication.

We cannot see the encrypted payload of the captured packets, but we can see metadata about the TLS connection. The highlighted areas above may be confusing. Why is it announcing two different TLS versions? At the record level it is announcing TLS 1.0. This is the lowest version the client is indicating it supports. In the Client Hello itself it is announcing TLS 1.2, which is the highest it supports. This tells the server the client accepts anything between TLS 1.0 and 1.2.

Looking at the screenshot you can see other proposed settings that the client is offering. The other main one is the list of cipher suites, which tells the server what encryption methods the client supports.

The client is much like a catcher in baseball. It proposes the pitches or connection parameters. The server or pitcher just says yes or no.

Here we can see the TLS 1.2 protocol was selected along with a single cipher suite. Many times the Server Hello is where it fails, either because the server requires a TLS version the client does not support or because the two share no cipher suite. On this blog we have it set to require TLS 1.2 or higher.
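The version negotiation can be reduced to a range overlap. The Python sketch below is a simplified model that treats each side as supporting a contiguous range of versions. The server's upper bound of 1.3 is an assumption, and the function name is hypothetical.

```python
def negotiate(client_min, client_max, server_min, server_max):
    """Versions are (major, minor) tuples; returns the agreed version or None."""
    highest_common = min(client_max, server_max)
    if highest_common < max(client_min, server_min):
        return None  # no overlap: the handshake fails with a protocol version alert
    return highest_common


TLS10, TLS12, TLS13 = (1, 0), (1, 2), (1, 3)

print(negotiate(TLS10, TLS12, TLS12, TLS13))  # (1, 2)
print(negotiate(TLS10, TLS10, TLS12, TLS13))  # None
```

The first call mirrors the capture above, where a client offering 1.0 through 1.2 lands on 1.2. The second shows what happens when the client is capped at 1.0 against a server that requires 1.2.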

Forcing to Fail TLS Handshake

I will instruct curl to connect with TLS 1.0 as the max

% curl --tls-max 1.0 https://blog.woohoosvcs.com       
curl: (35) error:1400442E:SSL routines:CONNECT_CR_SRVR_HELLO:tlsv1 alert protocol version

As you can see, the client complains but what does the capture look like?

TLS Handshake error - Fatal - Protocol Version

In the capture above you can see the client hello. It specifies a max of TLS 1.0. In the 3rd packet you can see the server responding with an Alert instead of Server Hello and the 4th packet the server is actually closing the connection due to the protocol negotiation issue.

Final Words

In this article we learned how to run Wireshark and capture packets. From there we learned how to investigate the TCP 3-way handshake and a TLS negotiation. In real world scenarios, this will at least help you weed out these two types of issues.

How to Organize Your Day

Summary

A few years ago, I had quite a few employees under me. They would always marvel at how I could keep track of the few hundred emails a day. I showed them how to organize them as described in “Inbox Zero”. Many of them took to it, others liked it but could not drive themselves to do it.

That about sums up my management style. I am always happy to show you what works for me. Not with the intent of forcing that methodology onto anyone. I do it to provide another way of doing things so that anyone can take that and mix it in with what works for them.

In this article, I aim to walk you through my processes. Take what you want from them and help make them your own. Write back and tell me how you have implemented them so others can learn too!

I do not think it would be appropriate to discuss organization without mentioning The 7 Habits of Highly Effective People. If you have the time, I highly recommend ordering a copy and reading it.

Keys

Some of the key themes are to set a time limit, use tools to help you organize/track and be intentional when possible.

First Thing

I usually wake up between 5-6AM. This may seem terribly early for some and late for others. We have two very active children so having some quiet time first thing in the morning is a hot commodity.

I read somewhere, quite a long time ago that the most successful people start out their day early. They are up at 5-6ish, catching relevant news to their job/business/industry, even if it is just playing in the background. It seems to have worked well for me. Wake up late and you are just playing catch up all day.

For me, I wake up to a nice cup of coffee and slowly wake my brain for the day. I follow a few subreddits on reddit. I typically limit this to 10-15 mins.

Sometimes I will have a small project or some emails that came in from overnight that I will start plugging into for about 30 mins.

The important part of this activity for myself is to lightly plug into things. It is just a warm up for the day. Otherwise, I’d just be a workaholic telling you to wake up at 5AM and start working all day!

Schedule

My schedule is very routine. I find comfort in this and it helps me stay organized. With that said, your schedule needs to be flexible enough for things that may derail it. Do not try to plan every minute. Work “fires” happen, and projects land on my desk unexpectedly. The framework I have set up for my day does not usually change though.

For example, when I arrive at the office, my first tasks are to check emails and log into our ticketing system. I also drop a line on any work chat applications to let everyone know I’m in and in front of a desk in case they need me.

I then look through my tasks and organize them based on priority. From 7 Habits of Highly Effective People, “First things, first!”. Many times things are marked “urgent” but they are not important. Those do not need your attention. Work on things that are important and urgent and then just important. Many times “urgent” issues that are not really urgent get cold fairly quickly.

Take Breaks

Taking breaks is extremely helpful. Many times when I have been spinning my wheels on difficult projects, I ultimately put it down for a bit. When I came back, I was refreshed and many times had a solution to the problem. This also works with smaller tasks. When you are able to focus, do so with intensity. Do not expect to be able to maintain that intensity for long periods of time.

Switch It Up

If you find that your returns are diminishing on a task and it is starting to slow down, start a new task. This pairs up with taking a break. Many times we just need to switch away from the current task to get a breather. Finding a new task that we can meet with enthusiasm helps. Returning client calls for 20 mins and worn out? Switch over to catching up on industry/career news for a few minutes. Again, be intentional and set limits. Don’t let that 5 minute switch over lead you to seven layers deep in a Wikipedia article (yes I have been there!). This helps organize your time and make it efficient.

Winding Down

At the end of the day, I try to wind down a good 30-45 mins before the steam bell rings. I realize this is not always possible as some jobs are pedal to the metal from clock in to clock out. Even then clock out is a blurry line. If you can though, wind down 30-45 mins prior. Many times, trying to crank out work until the last minute causes you to run over and deters you from obligations after work. Again, be intentional! If you intend to put in another 1-2 hours for the day at work, go ahead. If you want to try to leave right on time, give this a try.

Manage Your Manager

If you know what your manager needs, try to get it to them before they ask. No need to wait until they ask, assuming you have the time to pre-empt their need. If they drop a huge project on your lap but you are already working on a large project, ask them about expectations. Something like “I can certainly do this but I am already working on X. How would you rank the importance? Is it ok to complete it 2 days from now?” You will find being genuine goes a long way. Many times your interaction with your manager and the workload they give you dictates your day.

Final Words

We made it to the bottom and hopefully you have learned a few things to help organize your day. What helps you organize your day?

Inbox Zero – How To Organize Your Inbox

Summary

About 10 years ago my email inbox was out of control and I thought there had to be a better way. I had all sorts of rules “working for me” but it all seemed disorganized. Early in my career, I also had the pleasure of doing desktop support for executives. A common complaint was how terribly their email client performed. They would have tens of thousands of emails in their inbox. I then came across Inbox Zero.

What is Inbox Zero?

The basic premise is to keep your inbox count at zero or close to it. Use your inbox as a basic todo of things that need immediate response. Move or archive emails from your inbox that no longer require your attention.

Reminders are extremely helpful. Tools like Outlook allow you to mark emails for follow up with reminders while gmail has the snooze feature. The old school method is also to tag emails with a category and use a calendar item to remind you to follow up.

How Can I Achieve Inbox Zero?

Starting Out

The easiest way to start is to move all of your inbox into a new folder. If you use Outlook connected to either Exchange or IMAP, there is a good reason to split your folders to keep them under 10,000 emails. This is mainly for performance reasons. With something like Gmail or G Suite, simply apply a label and/or archive them.

Rules that auto move emails into folders are typically a bad idea. Disable or delete all of the rules you have. You want to process every piece of email that comes through with few exceptions.

Maintaining

This is the difficult part of the task. It requires a high level of commitment. Here is a high level thought process.

Is the email some sort of automated process that you just need to keep but not look at? This is one of the few types that I archive with a rule. For these I archive automatically if they indicate success. If they are an error or warning they go into my inbox.
Can I respond immediately to it? If so, respond and archive.
If the email is not something you can respond to timely, move to a folder and mark for follow up or use the snooze feature in gmail.
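For those who think in code, the thought process above can be written as a small decision function. This Python sketch is only an illustration: the dictionary keys and the returned action strings are hypothetical.

```python
def triage(email):
    """email: dict with boolean keys 'automated', 'success' and 'quick_reply'."""
    if email.get("automated"):
        # Successful automated reports are filed away; failures need eyes on them.
        return "archive" if email.get("success") else "inbox"
    if email.get("quick_reply"):
        return "respond and archive"
    return "move to folder and flag for follow-up"


print(triage({"automated": True, "success": True}))
print(triage({"quick_reply": True}))
print(triage({}))
```

Every message ends in exactly one of these actions, which is what keeps the inbox count trending toward zero.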

Automated emails that you do not even need to process should not require your attention. Many times we need them or have no choice in receiving them but may need to refer to them later on. These are ideal candidates for rules to auto archive.

One of the tenets of Inbox Zero is to be responsive. If you have an email but cannot properly respond, try replying that you received it but set a time when you can. Something as simple as “I received this but I am in meetings all day today and cannot fully respond until tomorrow”. This provides them with a response and an expectation of when to get an answer.

Keep in mind the goal is to get down to zero. Some days it happens, others it does not. Do not stress about not being able to get to zero. Do focus on continually trying to get there though.

Timing

Since this most likely requires more attention than you are used to, allocating time is important. Do not feel like you need to jump to your inbox every time a new message comes in. Wait for a few to arrive or only check it every 15-30 minutes. The frequency of combing through your inbox is determined by your ability to context switch.

Context switching is the ability to stop one task and immediately start up another. This typically takes some level of effort. That level of effort is dependent on the type of task you are switching from. If you are very deep on a project, it may be difficult to “come up for air” and get in the right frame of mind to switch. It may not be worth it to try to check email often. On the other hand, if you are doing light tasks you may be able to check email more often without much difficulty.

Inbox Zero Sounds Like More Work?

Just like the title says, this sounds like a ton of work. What is the benefit? The benefit is never having someone come up to you asking you if you got the email or could respond. It may seem like more work up front but over time it becomes second nature.

Final Words

My philosophy in management and knowledge sharing is to share and exchange ideas. My way may not be the best but I hope that you can take my method and adapt it to your needs. If you have your own method, before this or derived from it, feel free to share it!

Hello World From Google App Engine via PHP

Summary

This article builds upon the Running Google App Engine Behind Cloudflare article. Based on that article we have a default service for static content. In Google App Engine, there is only one instance of GAE per project. We can however have multiple services. All of the above is fairly trivial but this article is nuanced with using custom domains.

The Google Docs have various automatic routing that is supposed to work but does not quite seem to work properly. In all fairness, the documentation indicates custom domains only work on http not https. With that said, I could not even get it to work properly without configuring a dispatch.

Configuring Custom Domain

In the last article we configured www-test.woohoosvcs.com, but we will walk through it again.

We want to “Add” a custom domain. Walk through the normal setup and in this case it was “www-test.woohoosvcs.com”. DNS records were already in place but it lists the CNAME that should be put in place.

This time we were able to select the domain name and click “disable managed security” to stop that process.

Then flip over to SSL certificates and assign the wildcard to this domain. If you get tired of doing this you can assign a “*.woohoosvcs.com” domain but don’t delete the “woohoosvcs.com” if you do.

Google App Engine - Custom domains

Service Routing

Google has quite a few documents I will list below on how it “should work”. And maybe I just had bad luck. I played with it for a good hour and came to the conclusion that I had to use a dispatch for custom domains.

Here are a few articles I went through.

Setting up the Service

Please keep in mind, you must already have a default service to create a non default. We have done that in the previous article.

www-test is fairly simple. The structure is flat. Create a directory “www-test.woohoosvcs.com” and create the following files

index.php

<html>
  <head>
    <title>Hello World from Google App Engine in PHP</title>
  </head>
  <body>
    <h1>Hello World from Google App Engine in PHP!</h1>
    <p>
      <?php echo "This output is in php!"; ?>
    </p>
  </body>
</html>

app.yaml

runtime: php73

service: www-test

We also need a dispatch/dispatch.yaml

dispatch:
- url: "www-test.woohoosvcs.com/"
  service: www-test

Now deploy!

% gcloud app deploy www-test.woohoosvcs.com/app.yaml dispatch/dispatch.yaml

Your services should then look like this!

You can see the default as well as www-test with a dispatch route.

Testing!

Now I pull up https://www-test.woohoosvcs.com and expect my php page but I am sadly displeased. My default service responds. This is where I was for about an hour. Even the dispatch was giving me strange results until I loaded it up in curl and checked the headers.

...
> GET / HTTP/2
> Host: www-test.woohoosvcs.com
> User-Agent: curl/7.64.1
> Accept: */*
...
< age: 30
< cache-control: public, max-age=600
...

For static pages like the default, Google App Engine appears to cache the results. The dynamic pages like php do not appear to have cache (as expected).
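The two response headers are enough to estimate how long the stale page will keep being served. Here is a small Python sketch of that arithmetic, assuming the headers have been collected into a dictionary with lowercase keys. The function name is hypothetical.

```python
import re


def seconds_until_stale(headers):
    """Return the remaining freshness in seconds, or None if max-age is absent."""
    match = re.search(r"max-age=(\d+)", headers.get("cache-control", ""))
    if not match:
        return None
    age = int(headers.get("age", 0))
    return max(int(match.group(1)) - age, 0)


# Headers taken from the curl output above.
print(seconds_until_stale({"age": "30", "cache-control": "public, max-age=600"}))  # 570
```

In this case the cached copy of the default service would hang around for roughly another nine and a half minutes, which lines up with the problem clearing on its own a little later.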

I was able to hit another Google App Engine front end that did not have the result cached, and it worked.

curl -4 -kv https://www-test.woohoosvcs.com/ --resolve www-test.woohoosvcs.com:443:216.58.193.21

By the time I checked it again with the browser the cache had cleared and it was working as expected.

Final Words

Google App Engine is fairly flexible but it does have some nuances when trying to use custom domain names. If you are thinking about deploying something more complex outside of the default service, please do your testing ahead of time.

Moving WordPress Images To Google Storage

Summary

In my article Running Google App Engine Behind Cloudflare, the goal is to get to a point where horizontal scaling can happen. One of the final barriers is the location of the images. WordPress stores the images in wp-content on the local machine.

Horizontal Scaling

Once this is separated the WordPress site can be somewhat easily horizontally scaled. There are a few methods that can be used to achieve this, particularly in Google Cloud.

We could spin up more VMs and point them to the database
Load the WordPress Docker image into a Kubernetes Cluster
Run WordPress in Google App Engine

The actual method of horizontal scaling is out of scope for this document but this is the last barrier to get you to that decision.

References

I am going to give credit where it is due up front. Google’s tutorial on running WordPress on Google App Engine was a good starting point but not the first article I came across on this.

I came across this article from Kinsta which has some pretty good directions on a tool that looks extremely promising.

Storage Plugins

With anything WordPress, there is a plugin for it! Here are some options. We will choose one of these for this article.

Google Cloud Storage Plugin – I have not seen much on this one.
WP Offload Media – This one seems to have been around the longest but it will cost you to migrate existing content
WP-Stateless – Seems extremely promising. This shows up 3rd on the list but is the one we will implement in this article.

Preparation

For this article I decided it probably is not a good idea to make intrusive changes to this blog in order to generate more content. For this reason I decided to clone the production into staging.

Also, make sure to kick off a snapshot and backup of the database and VM beforehand.

Install WP-Stateless

We need to install and activate the WP-Stateless plugin.

Create Storage Bucket

WP-Stateless will guide you through creating the bucket but I wanted to do that manually to walk through the options. In the Storage / Browser section click “Create Bucket”. I had every intention of using a custom domain name but, according to Google, it does not appear to be supported over HTTPS.

With Cloudflare since I have it set to strict, it is expecting an origin cert. If I were to downgrade the site to flexible it would connect over 80 and likely work. This is a little bit concerning because Cloudflare’s CDN is great and this now bypasses it.

“Note: You can use a CNAME redirect only with HTTP, not with HTTPS. To serve your content through a custom domain over SSL, you can set up a load balancer.”
https://cloud.google.com/storage/docs/request-endpoints#cname

Name Bucket – use fully qualified domain name to help future proof but not required.

This is my lab so I chose the cheapest option, but since we cannot put the bucket behind Cloudflare in strict mode, you may opt for multi-region. Google Storage is not a full CDN but there are tools you can layer on top of it for that.

Next we will set up the ACLs. This is a mistake I made during the initial setup assuming there were no ACLs. The result was that the sync would claim to work but nothing would actually happen. I left this to show the config mistake.

Configure WP-Stateless

It then runs you through a nice wizard. You can do a manual run as well. If you go that route, WP-Stateless’s instructions are fairly complete.

The first step asks you to log in, and it generates a JSON file for you so that it can authenticate.

In the configuration we need to set a few options, namely the Stateless mode.

Due to the SSL issue we will leave domain blank. There is currently nowhere to upload the origin cert so a CNAME uses the A record’s SSL cert and would cause a cert mismatch. Being in strict mode, this won’t work but it may work in “full” or “flexible” since Cloudflare does not validate the cert in one case or use it in the other. If you are in “Full” or lower, give it a shot though!
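With the domain left blank, images are served straight from Google's endpoint rather than from a custom hostname. As a rough sketch of what those URLs look like, the helper below maps a file under the uploads directory to a `https://storage.googleapis.com/<bucket>/<object>` address. It assumes the plugin keeps the same year/month path inside the bucket, and the bucket name in the usage note is a placeholder, not the one used for this blog.

```python
import posixpath
from urllib.parse import quote

# Uploads directory as used on the VM in this article.
UPLOADS_ROOT = "/var/www/html/wp-content/uploads"


def gcs_public_url(bucket, local_path, uploads_root=UPLOADS_ROOT):
    """Map a file under the uploads directory to its storage.googleapis.com URL."""
    relative = posixpath.relpath(local_path, uploads_root)
    if relative.startswith(".."):
        raise ValueError(f"{local_path} is not under {uploads_root}")
    # quote() leaves '/' alone by default, so the year/month structure survives.
    return f"https://storage.googleapis.com/{quote(bucket)}/{quote(relative)}"
```

For example, `gcs_public_url("example-bucket", UPLOADS_ROOT + "/2019/11/GAE-Cert-300x52.jpg")` gives `https://storage.googleapis.com/example-bucket/2019/11/GAE-Cert-300x52.jpg`.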

Run a sync and you’re off!

Initially I was running on a micro instance with under 1GB RAM and it locked up and ran out of RAM. The default Bulk size is 1, you may need to go closer to 1. I re-ran this on a 1.7GB instance and ran with 1 and had no issues.

On the VM itself I validated images were removed.

$ find /var/www/html/wp-content/uploads/ | wc -l
441

$ find /var/www/html/wp-content/uploads/ | wc -l
42

There are still 42 entries (find counts the directories as well as the images). We’ll track that down!

Some of these images did not have proper permissions. Since I manually synced the filesystems for this staging environment some of the newer images had incorrect permissions.

$ find /var/www/html/wp-content/uploads/ -ls
   131221      4 drwxr-xr-x   3 www-data www-data     4096 Oct 25 19:22 /var/www/html/wp-content/uploads/
   131223      4 drwxr-xr-x   4 www-data www-data     4096 Nov  1 08:38 /var/www/html/wp-content/uploads/2019
   131224      4 drwxr-xr-x   2 www-data www-data     4096 Nov  8 14:40 /var/www/html/wp-content/uploads/2019/10
   131891     12 -rw-r--r--   1 www-data www-data     9512 Nov  8 03:46 /var/www/html/wp-content/uploads/2019/10/http_1_1-100x100.png
   131849      4 -rw-r--r--   1 www-data www-data     2835 Nov  8 03:46 /var/www/html/wp-content/uploads/2019/10/logo-100x100.jpg
   131866     12 -rw-r--r--   1 www-data www-data     9200 Nov  8 03:46 /var/www/html/wp-content/uploads/2019/10/K2167-100x100.png
   131682      4 -rw-r--r--   1 www-data www-data     3324 Nov  8 03:46 /var/www/html/wp-content/uploads/2019/10/wh_header-100x100.jpg
   131229     20 drwxr-xr-x   2 www-data www-data    20480 Nov  8 14:40 /var/www/html/wp-content/uploads/2019/11
   147546     12 -rw-r--r--   1 root     root         8442 Nov  8 03:46 /var/www/html/wp-content/uploads/2019/11/GAE-Default-768x117.jpg
   147574      4 -rw-r--r--   1 root     root         3173 Nov  8 03:46 /var/www/html/wp-content/uploads/2019/11/GAE-Cert-300x52.jpg
   147550      8 -rw-r--r--   1 root     root         7206 Nov  8 03:46 /var/www/html/wp-content/uploads/2019/11/CF-DNS-Only-1024x48.jpg
   136683      8 -rw-r--r--   1 root     root         6938 Nov  8 03:46 /var/www/html/wp-content/uploads/2019/11/SSL-CF-Origin-300x140.jpg
   131918      4 -rw-r--r--   1 root     root         1604 Nov  8 03:46 /var/www/html/wp-content/uploads/2019/11/CF-DNS-Only-150x49.jpg
   147575     32 -rw-r--r--   1 root     root        31759 Nov  8 03:46 /var/www/html/wp-content/uploads/2019/11/mysql-purge-1024x186.jpg
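Spotting the odd owner in a long listing is easy to get wrong by eye. As a small sketch (the function name is my own), the following walks the uploads tree and returns every file whose owner differs from an expected numeric uid.

```python
import os


def files_not_owned_by(root, uid):
    """Return regular files under root whose owner is not the given numeric uid."""
    mismatched = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # lstat() so that a symlink is judged by its own owner, not its target's.
            if os.lstat(path).st_uid != uid:
                mismatched.append(path)
    return sorted(mismatched)
```

The uid for the web server account can be looked up with `pwd.getpwnam("www-data").pw_uid`, assuming that account exists on the machine.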

Chown to the rescue

$ sudo chown -R www-data:www-data /var/www/html/wp-content/uploads/*

Now we’re down to 8 after running again and these are actually unused images.

$ find /var/www/html/wp-content/uploads/ | wc -l
8

Final Words

Here we have used a free plugin to move our images to a shared and central repository. Due to my configuration and desire to keep it, it does not leverage Cloudflare’s CDN but you are able to make your own decision on that.

UPDATE: 20191109 – I can confirm lowering Cloudflare security to Full and adding the CNAME to c.storage.googleapis.com does allow this to work. It would be a decision point at the time of needing this whether I go that route.

At this point, at least in this test environment, I could spin up multiple front ends now to handle any excess of traffic.

Another benefit of this is it helps keep your VM light without having to store all of your images on it.

Running Google App Engine Behind Cloudflare

Summary

I had the need of a fairly static site in my infrastructure ecosystem. I thought, why not write an article about it with the nuance of putting it behind Cloudflare. There are much easier solutions for this static site, including running it off my WordPress server. In any case, this makes a neat introduction to Google App Engine.

In all fairness, this article is derived from Hosting a static website on Google App Engine but puts a slight spin with Cloudflare.

This is also not a usual use case for static hosting.

Requirements

Having a Google Cloud account is a must but that is already assumed. Next we need to download and install the Google Cloud SDK.

For macOS it’s fairly simple. It is available here – Google Cloud SDK documentation

Installing the SDK

% pwd
/Users/dwcjr

% curl -O https://dl.google.com/dl/cloudsdk/channels/rapid/downloads/google-cloud-sdk-270.0.0-darwin-x86_64.tar.gz           

% tar xzf google-cloud-sdk-270.0.0-darwin-x86_64.tar.gz

% ./google-cloud-sdk/install.sh
Welcome to the Google Cloud SDK!
....

# We want to install this because our static site will be PHP based
% ./google-cloud-sdk/bin/gcloud components install app-engine-php

% ./google-cloud-sdk/bin/gcloud init
....
You are logged in as: [[email protected]].

Pick cloud project to use: 
 [1] woohoo-blog-2414
 [2] Create a new project
Please enter numeric choice or text value (must exactly match list 
item):  1
...

Deploying the App

Here I set up a www structure as per the Google article.

% pwd
/Users/dwcjr/Downloads/woohoosvcs.com
% find ./ -type d
./
.//www
.//www/css
.//www/images
.//www/js

% cat app.yaml 
runtime: php73

handlers:
- url: /
  static_files: www/index.html
  upload: www/index.html

- url: /(.*)
  static_files: www/\1
  upload: www/(.*)
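To make the handler order concrete, here is a rough model of how the two rules above resolve a request path: handlers are tried top to bottom, the first pattern that matches the whole path wins, and `\1` is replaced by the captured group. It uses Python's `re` module as a stand-in, so treat it as an approximation of App Engine's matching rather than its actual implementation.

```python
import re

# (url pattern, static_files template) pairs, in the same order as app.yaml.
HANDLERS = [
    (r"/", r"www/index.html"),
    (r"/(.*)", r"www/\1"),
]


def resolve_static_file(path, handlers=HANDLERS):
    """Return the file the first matching handler would serve, or None."""
    for pattern, template in handlers:
        match = re.fullmatch(pattern, path)
        if match:
            # expand() substitutes \1 with the text captured by the url pattern.
            return match.expand(template)
    return None
```

Here `/` resolves to `www/index.html` and `/css/site.css` resolves to `www/css/site.css`, which is why the directory layout under www matters.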

We need to make sure www/index.html exists. Make a simple hello world in it. Something like the following would suffice

<html>
  <head>
    <title>Hello, world!</title>
  </head>
  <body>
    <h1>Hello, world!</h1>
    <p>
      This is a simple static HTML file that will be served from Google App
      Engine.
    </p>
  </body>
</html>

Now we deploy with gcloud app deploy. It produces a bit of output but, a few minutes later, the app is deployed.

% gcloud app deploy
Services to deploy:

descriptor:      [/Users/dwcjr/Downloads/woohoosvcs.com/app.yaml]
source:          [/Users/dwcjr/Downloads/woohoosvcs.com]
target project:  [woohoo-blog-2414]
target service:  [default]
target version:  [20191107t164351]
target url:      [https://woohoo-blog-2414.appspot.com]


Do you want to continue (Y/n)?  Y

Beginning deployment of service [default]...

Custom Hostnames

This is where some trickery happens. Not really, it is fairly straightforward, particularly with Cloudflare. We need to navigate to Settings / Custom Domains and add one. We will use www-test.woohoosvcs.com for this demo.

Walk through the setup with your domains

We then need to hop over to Cloudflare and add the CNAME as requested. Make sure it is set up as “DNS only”. We do this so Google can validate the domain for its managed certificate. We do not want to use the managed certificate, but Google will not allow us to assign a custom one otherwise.

www-test CNAME – DNS only

We are then going to hop on over to GAE’s Certificates section

I will then upload my stored copy from the WordPress site. After doing so an interesting issue happened.

“The private key you’ve selected does not appear to be valid.”

This ended up being “XXX PRIVATE KEY” not being “XXX RSA PRIVATE KEY” so I simply modified the BEGIN and END to have RSA and it went through!
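For context, the two headers name different containers: “BEGIN RSA PRIVATE KEY” is the older PKCS#1 form and “BEGIN PRIVATE KEY” is PKCS#8. Editing the header worked for me here, but the two formats normally differ in more than the header, so converting the key with a tool such as openssl is usually the safer route. The sketch below only reports which header a PEM file carries so you know what you are about to upload; the function name is my own.

```python
import re

PEM_HEADER = re.compile(r"-----BEGIN ([A-Z0-9 ]*PRIVATE KEY)-----")

# Header label -> container format it normally indicates.
FORMATS = {
    "RSA PRIVATE KEY": "PKCS#1 (RSA)",
    "PRIVATE KEY": "PKCS#8 (unencrypted)",
    "ENCRYPTED PRIVATE KEY": "PKCS#8 (encrypted)",
    "EC PRIVATE KEY": "SEC1 (EC)",
}


def private_key_format(pem_text):
    """Name the container format implied by the first private-key PEM header."""
    match = PEM_HEADER.search(pem_text)
    if match is None:
        return None
    return FORMATS.get(match.group(1), "unknown")
```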

We then want to hop back over to Custom Domains and disable managed security. Managed security auto-generates a certificate, and we will be using the origin certificate instead.

Now if we click back on SSL Certificates it will allow us to drill into Cloudflare-Origin and assign.

SSL certificates – Cloudflare-Origin

We can then set the www-test.woohoosvcs.com back to Proxy to protect it

The Test!

We can test with curl to make sure it is going through Cloudflare.

% curl -v https://www-test.woohoosvcs.com
*   Trying 2606:4700:20::681a:d78...
* TCP_NODELAY set
* Connected to www-test.woohoosvcs.com (2606:4700:20::681a:d78) port 443 (#0)
.....
*  subject: C=US; ST=CA; L=San Francisco; O=Cloudflare, Inc.; CN=sni.cloudflaressl.com
.....
*  issuer: C=US; ST=CA; L=San Francisco; O=CloudFlare, Inc.; CN=CloudFlare Inc ECC CA-2
.....
* Using Stream ID: 1 (easy handle 0x7fd338005400)
> GET / HTTP/2
> Host: www-test.woohoosvcs.com
> User-Agent: curl/7.64.1
> Accept: */*
> 
* Connection state changed (MAX_CONCURRENT_STREAMS == 256)!
.....
< server: cloudflare
.....

Final Words

It may be a convoluted way, but we now have a hosted static site behind Cloudflare using Strict TLS, all of it running on Google App Engine.

In this article for the Custom Hostname, it should be possible to disable managed security so that www-test.woohoosvcs.com could start out as a Proxied entry and then associate the certificate. I had issues with that but it could have been me rebuilding the environment too quickly for the lab I was doing.

Upsizing WordPress MySQL to Google Cloud SQL

Summary

From my previous article How I Stood Up WordPress In a Day, we stood up a “Quick and Dirty” version. It was a fast and easy setup but an all in one. What happens if your WordPress site really takes off? This is not highly scalable as it is limited to the resources of the box. Perhaps your hosting provider lets you increase the size of the VM.

Scaling WordPress

Eventually you will get to a point where you reach the max. This is called vertical scaling. It is one of the easier methods but only gets so far and leads to monolithic infrastructures.

We need to be able to scale horizontally by adding redundant nodes. The database is the first piece of this. Since we implemented in Google Cloud, we will be using their managed SQL instance. In AWS this is called RDS.

Another issue we do not yet address is the fact that images are stored locally on the WordPress server itself. Here is our article on that – Moving WordPress Images To Google Storage.

With that said, removing MySQL server from the WordPress server does leave more resources for the WordPress server itself.

Backup

Always run a backup before a major change. In this case we use Google Disk Snapshots for our Google VM, and we took one before starting.

Provisioning Google MySQL

For this, we opted for the “create” method. Google does have the “migrate” option which involves adding the new instance as a read replica. This is a small WordPress site so we will simply create a backup and restore it and go from there.

Set instance information, passwords, etc.

Tutorial if you wish. This is a tutorial only.

Connecting to Google MySQL Instance

Here we will connect from the VM to the instance as root. You can see the Server version includes “Google”. We will then create the wordpress database and grant access. This is not the most secure GRANT, but we are copying what was there; it can be locked down based on best practices. The CREATE options will be highly dependent on your existing setup, and we’ll talk about that further in the troubleshooting section.

$ mysql -h 10.30.128.3 -u root -p
Enter password: 
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 50
Server version: 5.7.14-google-log (Google)

Copyright (c) 2000, 2019, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> CREATE DATABASE wordpress CHARACTER SET latin1 COLLATE latin1_swedish_ci;
mysql> GRANT ALL PRIVILEGES ON wordpress.* TO "wordpress"@"%" IDENTIFIED BY "XXXXXXXXXX";

mysql> FLUSH PRIVILEGES;

mysql> quit;

Now we need to backup the existing wordpress database

$ mysqldump -u wordpress -p wordpress > 20191107-wordpress.sql

And then import it

$ mysql -h 10.30.128.3 -u wordpress -p wordpress < 20191107-wordpress.sql
Enter password: 
$
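Since the import finishes silently, a quick sanity check of the dump file can save some head scratching later. The sketch below lists the tables that have a CREATE TABLE statement in the file and checks for two core WordPress tables. It assumes the usual mysqldump layout with backtick-quoted table names and the default wp_ prefix; adjust the prefix if yours differs.

```python
import re

# mysqldump normally writes one "CREATE TABLE `name` (" line per table.
CREATE_TABLE = re.compile(r"^CREATE TABLE `([^`]+)`", re.MULTILINE)


def tables_in_dump(sql_text):
    """List table names that have a CREATE TABLE statement in a dump file."""
    return CREATE_TABLE.findall(sql_text)


def looks_like_wordpress_dump(sql_text, prefix="wp_"):
    """True if the core options and posts tables are present for the given prefix."""
    tables = set(tables_in_dump(sql_text))
    return {f"{prefix}options", f"{prefix}posts"} <= tables
```

Reading the file with `open("20191107-wordpress.sql").read()` and passing the text to `looks_like_wordpress_dump` is enough for a small site like this one.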

Validation

In the Google console we can then validate this.

WordPress Config

We need to modify wp-config.php as follows. If your user and password changed, those need to be updated as well.

// ** MySQL settings - You can get this info from your web host ** //
/** The name of the database for WordPress */
define( 'DB_NAME', 'wordpress' );

/** MySQL database username */
define( 'DB_USER', 'wordpress' );

/** MySQL database password */
define( 'DB_PASSWORD', 'XXXXXX' );

/** MySQL hostname */
#define( 'DB_HOST', 'localhost' );
define( 'DB_HOST', '10.30.128.3' );
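When a commented-out line and a live line define the same setting, it is easy to misread which one is in effect. As a rough check, the sketch below extracts only the active string define() calls from wp-config.php text. It is a simple line-based regex of my own, so it will not understand block comments or defines split across lines.

```python
import re

# Matches: define( 'NAME', 'value' );  at the start of a line (leading spaces allowed).
DEFINE = re.compile(r"""^\s*define\(\s*['"](\w+)['"]\s*,\s*['"]([^'"]*)['"]\s*\)\s*;""")


def active_defines(wp_config_text):
    """Return {name: value} for string define() calls that are not commented out."""
    settings = {}
    for line in wp_config_text.splitlines():
        match = DEFINE.match(line)  # lines starting with '#' or '//' do not match
        if match:
            settings[match.group(1)] = match.group(2)
    return settings
```

For the snippet above, `active_defines(text)["DB_HOST"]` comes back as `10.30.128.3`, confirming that the localhost line is no longer in play.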

Testing

Once you save the changes go to your WordPress site and test. If you get an install.php page, stop right there and back out the change. We have some troubleshooting steps below.

Backup Again!

If validation is successful, run a backup again, both the Google disk snapshot for the VM and a backup of the Google MySQL instance. This way we have a known good state immediately following the migration.

It is highly recommended to back up before you perform any of the short or long term decommission steps.

Troubleshooting

There are a few causes to get redirected to the install.php page

Incorrect database settings, including host, user, password, database name and table prefixes
Collation/Characterset – case insensitive versus sensitive
Not actually importing the database
Improper wordpress user permissions

Collation

This can be checked using the following commands. It is best to keep the same settings when creating. It is also described here – https://stackoverflow.com/questions/9827164/wordpress-keeps-redirecting-to-install-php-after-migration

mysql> SELECT @@character_set_database, @@collation_database;
+--------------------------+----------------------+
| @@character_set_database | @@collation_database |
+--------------------------+----------------------+
| latin1                   | latin1_swedish_ci    |
+--------------------------+----------------------+
1 row in set (0.00 sec)

Database Connectivity

The others can be wrapped up into database connectivity. We tested this by connecting as the wordpress user and importing as that user after the database was created.

Decommissioning old database

At each of these steps it is important to test the site to ensure it doesn’t break. If you are still somehow pointing to your local mysql instance, it can break. You will definitely find that out during these steps.

Short Term

We do not want old mysql data lying around so the first steps to complete afterwards are to stop and disable mysql. This also helps us confirm we are using the new MySQL instance.

$ sudo systemctl stop mysql
$ sudo systemctl disable mysql
Synchronizing state of mysql.service with SysV service script with /lib/systemd/systemd-sysv-install.
Executing: /lib/systemd/systemd-sysv-install disable mysql
Removed /etc/systemd/system/multi-user.target.wants/mysql.service.

Long Term

Long term we want to delete the backup so we don’t have extra data lying around and remove mysql and its database files.

$ sudo apt-get remove --purge mysql-server mysql-community-server

Now let’s remove our backup.

$ rm 20191107-wordpress.sql

Final Words

At this point we have accomplished scaling to a potentially highly available database. This database also can be dynamically sized to accommodate extra load. For the sake of this article, we chose the smallest size possible due to the current load. Should this go viral though, the database can easily be scaled.

Spinning Up Rancher With Kubernetes

Summary

The Rancher ecosystem is an umbrella of tools. We will specifically be talking about the Rancher product, sometimes referred to as Rancher Server. Rancher is an excellent tool for managing and monitoring your Kubernetes cluster, no matter where it exists.

Requirements and Setup

The base requirement is just a machine that has docker. For the sake of this article, we will use their RancherOS to deploy.

RancherOS touts itself as being the lightest weight OS capable of running docker. All of the system services have been containerized as well. The most difficult part of installing “ros” is using the cloud-config.yml to push your keys to it!

We will need the installation media as can be found here

The minimum requirements state 1GB of RAM but I had issues with that and bumped my VM up to 1.5GB. It was also provisioned with 1 CPU Core and 4GB HDD.

A cloud-config.yml should be provisioned with your ssh public key

#cloud-config
ssh_authorized_keys:
  - ssh-rsa XXXXXXXXXXXXXXXXXXXXXXXXXXXXX
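If you would rather not hand-edit the file, it can be generated from an existing public key. This is a minimal sketch that emits only the two keys shown above; the function name is my own and nothing here is RancherOS tooling.

```python
def render_cloud_config(public_keys):
    """Build a minimal #cloud-config document containing only ssh_authorized_keys."""
    if not public_keys:
        raise ValueError("at least one SSH public key is required")
    lines = ["#cloud-config", "ssh_authorized_keys:"]
    # strip() drops the trailing newline that .pub files usually end with.
    lines += [f"  - {key.strip()}" for key in public_keys]
    return "\n".join(lines) + "\n"
```

Typical use would be to read `~/.ssh/id_rsa.pub`, pass its contents in a one-element list, and write the result to cloud-config.yml.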

We also assume you will be picking up from the Intro to Kubernetes article and importing that cluster.

Installing RancherOS

On my laptop I ran the following command in the same directory that holds the cloud-config.yml. This is a neat way to have a quick and dirty web server on your machine. The command below is the Python 2 form; on Python 3 the equivalent is python3 -m http.server 8000.

python -m SimpleHTTPServer 8000
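The same quick and dirty web server can also be started from a script, which is handy if you want it to serve a specific directory and shut down cleanly afterwards. This is a sketch for Python 3.7 or newer using only the standard library; the function name and defaults are my own choices.

```python
import functools
import http.server
import threading


def start_file_server(directory, port=8000, host="0.0.0.0"):
    """Serve `directory` over HTTP in a background thread and return the server."""
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=directory
    )
    server = http.server.ThreadingHTTPServer((host, port), handler)
    # daemon=True so a forgotten server does not keep the interpreter alive.
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

Call `server.shutdown()` followed by `server.server_close()` once the install has fetched the file.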

In the rancher window

sudo ros install -c http://192.168.116.1:8000/cloud-config.yml -d /dev/sda

A few prompts, including a reboot, and you will be asking yourself whether it was really that easy. When it boots up, it shows you the IP to make it that much easier to connect remotely. After all, you are only enabled for ssh key auth at this point and cannot really log in at the console.

 % ssh [email protected]
The authenticity of host '192.168.116.182 (192.168.116.182)' can't be established.
ECDSA key fingerprint is SHA256:KGTRt8HZu1P4VFp54vOAxf89iCFZ3jgtmdH8Zz1nPOA.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '192.168.116.182' (ECDSA) to the list of known hosts.
Enter passphrase for key '/Users/dwcjr/.ssh/id_rsa': 
[rancher@rancher ~]$

Starting Up Rancher

And we’re in! We will then do a single node self-signed cert install per – https://rancher.com/docs/rancher/v2.x/en/installation/single-node/

[rancher@rancher ~]$ docker run -d --restart=unless-stopped \
> -p 80:80 -p 443:443 \
> rancher/rancher:latest
Unable to find image 'rancher/rancher:latest' locally
latest: Pulling from rancher/rancher
22e816666fd6: Pull complete 
079b6d2a1e53: Pull complete 
11048ebae908: Pull complete 
c58094023a2e: Pull complete 
8a37a3d9d32f: Pull complete 
e403b6985877: Pull complete 
9acf582a7992: Pull complete 
bed4e005ec0d: Pull complete 
74a2e9817745: Pull complete 
322f0c253a60: Pull complete 
883600f5c6cf: Pull complete 
ff331cbe510b: Pull complete 
e1d7887879ba: Pull complete 
5a5441e6019b: Pull complete 
Digest: sha256:f8751258c145cfa8cfb5e67d9784863c67937be3587c133288234a077ea386f4
Status: Downloaded newer image for rancher/rancher:latest
76742197270b5154bf1e21cf0ba89479e0dfe1097f84c382af53eab1d13a25dd
[rancher@rancher ~]$

Connect via HTTPS to the rancher server and you’ll get the new user creation for admin

The next question is an important design decision. The Kubernetes nodes that this will be managing need to be able to connect to the rancher host. The reason is that agents are deployed that phone home. The warning in this next message is ok for this lab.

Importing a Cluster

In this lab I have been getting the following error but click over to clusters and it moves on with initializing.

Failed while: Wait for Condition: InitialRolesPopulated: True

It will stay in initializing for a little bit. Particularly in this lab with minimal resources. We are waiting for “Pending”.

Now that it is pending we can edit it for the kubectl command to run on the nodes to deploy the agent

Copy the bottom option to the clipboard since we used a self-signed cert that the Kubernetes cluster does not trust.

Deploying the Agent

Run the curl!

root@kube-master [ ~ ]# curl --insecure -sfL https://192.168.116.182/v3/import/zdd55hx249cs9cgjnp9982zd2jbj4f5jslkrtpj97tc5f4xk64w27c.yaml | kubectl apply -f -
clusterrole.rbac.authorization.k8s.io/proxy-clusterrole-kubeapiserver created
clusterrolebinding.rbac.authorization.k8s.io/proxy-role-binding-kubernetes-master created
namespace/cattle-system created
serviceaccount/cattle created
clusterrolebinding.rbac.authorization.k8s.io/cattle-admin-binding created
secret/cattle-credentials-79f50bc created
clusterrole.rbac.authorization.k8s.io/cattle-admin created
deployment.apps/cattle-cluster-agent created
The DaemonSet "cattle-node-agent" is invalid: spec.template.spec.containers[0].securityContext.privileged: Forbidden: disallowed by cluster policy

Boo – what is “disallowed by cluster policy”? This is a permission issue.

On Kubernetes 1.14 you can set “--allow-privileged=true” on the apiserver and kubelet. It is deprecated in higher versions. Make that change on our 1.14 cluster and we’re off to the races!

root@kube-master [ ~ ]# vi /etc/kubernetes/apiserver
root@kube-master [ ~ ]# vi /etc/kubernetes/kubelet
root@kube-master [ ~ ]# systemctl restart kube-apiserver.service kubelet.service 
root@kube-master [ ~ ]# curl --insecure -sfL https://192.168.116.182/v3/import/zdd55hx249cs9cgjnp9982zd2jbj4f5jslkrtpj97tc5f4xk64w27c.yaml | kubectl apply -f -
clusterrole.rbac.authorization.k8s.io/proxy-clusterrole-kubeapiserver unchanged
clusterrolebinding.rbac.authorization.k8s.io/proxy-role-binding-kubernetes-master unchanged
namespace/cattle-system unchanged
serviceaccount/cattle unchanged
clusterrolebinding.rbac.authorization.k8s.io/cattle-admin-binding unchanged
secret/cattle-credentials-79f50bc unchanged
clusterrole.rbac.authorization.k8s.io/cattle-admin unchanged
deployment.apps/cattle-cluster-agent unchanged
daemonset.apps/cattle-node-agent created

Slow races but we’re off. Give it a good few minutes to make some progress. While we wait for this node to provision, set “--allow-privileged=true” on the other nodes in /etc/kubernetes/kubelet.

We should now see some nodes and the status has changed to “waiting” and we will do just that. By now, if you haven’t realized, Kubernetes is not “fast” on the provisioning. Well at least in these labs with minimal resources 🙂

Checking on the status I ran into this. My first thought was RAM on the master node. I have run into this enough before.

This cluster is currently Provisioning; areas that interact directly with it will not be available until the API is ready.
Exit status 1, unable to recognize "management-state/tmp/yaml-705146024": the server is currently unable to handle the request unable to recognize "management-state/tmp/yaml-705146024"

Sure enough, running top and checking the console confirmed that.

kube-master out of ram. Time to increase a little to cover the overhead of the agent. Went from 768MB to 1024MB and back up and at ’em!

It did sit at the following error for some time.

This cluster is currently Provisioning; areas that interact directly with it will not be available until the API is ready.
Exit status 1, unable to recognize "management-statefile_path_redacted":

Some indications show this eventually resolves itself. Others have indicated adding a node helps kick off the provisioning to continue. In my case a good 10 minutes and we’re all green now!

Rancher Cluster Dashboard

Navigating Around

We saw the cluster area. Let’s drill into the nodes!

Rancher partitions the clusters into System and Default. This is a carryover from “ros” which does the same to the OS.

Final Words

Rancher extends the functionality of Kubernetes, even on distributions of Kubernetes that are not Rancher. Those extensions are beyond the scope of this article. At the end of this article though you have a single node Rancher management tool that can manage multiple clusters. We did so with RancherOS. Should you want to do this in production it is recommended to have a “management” Kubernetes cluster to make rancher highly available and use a certificate trusted by Kubernetes, issued by a trusted CA.

When shutting down this lab, I saw that the kube-node1/2 ran out of memory and I had to increase them to 1GB as well for future boots to help avoid this.

Network Configuration Management With Rancid

Summary

Network configuration management is often overlooked. Worse yet, companies with strong Change Management practices often believe they do not need config management because of those practices.

The issue is that sometimes commands entered to network gear do not take effect as we expect or we want to compare history and easily diff changes for root cause analysis.

Rancid

Rancid is a free open source tool to handle just this. I have successfully used it for the past few years. It has been a great tool and caught a typo from time to time as well as unexpected application of commands.

At a high level, the way it works is to pull a full config each time and push it into a version control system like CVS or Subversion. Git is also a popular choice but not really necessary as we will not be branching.

Once the configs are pumped into a versioning system, it is easy to produce diffs and any time rancid runs, it outputs the diffs so you can see the change.
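To illustrate the idea, here is a small stand-alone sketch of diffing two pulls of the same device config. Rancid itself relies on the version control system for this, so the code below is only a model of the concept using Python's difflib, and the config lines in the usage note are made up.

```python
import difflib


def config_diff(previous, current, name="router"):
    """Return a unified diff between two saved copies of a device config."""
    return "".join(
        difflib.unified_diff(
            previous.splitlines(keepends=True),
            current.splitlines(keepends=True),
            fromfile=f"{name} (previous)",
            tofile=f"{name} (current)",
        )
    )
```

Given one pull containing `ntp server 10.0.0.1` and a later pull containing `ntp server 10.0.0.2`, the result contains a `-` line for the old server and a `+` line for the new one; identical pulls produce an empty string.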

The initial setup of rancid is often a barrier to entry. Once you get it setup the first time, upgrades are fairly simple.

Installing

For this demo, we are using a VM. We installed a minimal install CentOS 8.0 on a 1GB RAM, 10GB HDD with 1 CPU core. Production specs are not much more than this depending on how many devices you are querying and how often.

Let’s download the tar first!

[root@rancid ~]# curl -O https://shrubbery.net/pub/rancid/rancid-3.10.tar.gz

We need to install some dependencies! Expect is the brains of rancid and used to send and receive data from the network devices. Many of the modules that manipulate the data received are perl. Gcc and make are used to build the source code.

We need some sort of mailer, hence sendmail. You can use postfix if you prefer that.

We will be using CVS for simplicity and the default configuration of rancid.

[root@rancid rancid-3.10]# yum install expect perl gcc make sendmail

[root@rancid rancid-3.10]# yum install https://dl.fedoraproject.org/pub/epel/epel-release-latest-8.noarch.rpm

[root@rancid rancid-3.10]# yum install cvs

We then need to extract and build the source!

[root@rancid ~]# tar xzf rancid-3.10.tar.gz 

[root@rancid ~]# ls -la | grep rancid
drwxr-xr-x.  8 7053 wheel   4096 Sep 30 18:15 rancid-3.10
-rw-r--r--.  1 root root  533821 Nov  5 06:13 rancid-3.10.tar.gz

[root@rancid ~]# cd rancid-3.10



[root@rancid rancid-3.10]# ./configure
checking for a BSD-compatible install... /usr/bin/install -c
checking whether build environment is sane... yes
checking for a thread-safe mkdir -p... /usr/bin/mkdir -p
checking for gawk... gawk
.....
config.status: creating include/config.h
config.status: executing depfiles commands

[root@rancid rancid-3.10]# make
... tons of output
gmake[1]: Leaving directory '/root/rancid-3.10/share'

[root@rancid rancid-3.10]# make install

[root@rancid rancid-3.10]# ls -la /usr/local/rancid/
total 4
drwxr-xr-x.  7 root root   63 Nov  5 06:21 .
drwxr-xr-x. 13 root root  145 Nov  5 06:20 ..
drwxr-xr-x.  2 root root 4096 Nov  5 06:21 bin
drwxr-xr-x.  2 root root   90 Nov  5 06:21 etc
drwxr-xr-x.  3 root root   20 Nov  5 06:21 lib
drwxr-xr-x.  4 root root   31 Nov  5 06:21 share
drwxr-xr-x.  2 root root    6 Nov  5 06:20 var

We very likely do not want this to run as root so we will need to create a user. By default, rancid gets installed to /usr/local/rancid so we will set that to the user’s home directory

[root@rancid rancid-3.10]# useradd -d /usr/local/rancid -M -U rancid

[root@rancid rancid-3.10]# chown rancid:rancid /usr/local/rancid/
[root@rancid rancid-3.10]# chown -R rancid:rancid /usr/local/rancid/*

[root@rancid rancid-3.10]# su - rancid
[rancid@rancid ~]$ pwd
/usr/local/rancid

To preserve permissions, all further changes should be made under the rancid user.

Configuring Rancid

The global rancid configuration, rancid.conf is dictated by the following format – https://www.shrubbery.net/rancid/man/rancid.conf.5.html

We will need to modify the following line

# list of rancid groups
LIST_OF_GROUPS="networking"

Configuring Devices

cloginrc

This follows a specific format as described here – https://www.shrubbery.net/rancid/man/cloginrc.5.html

[rancid@rancid ~]$ cat .cloginrc
add user test-f5	root
add password test-f5	XXXXXXX
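Because .cloginrc holds device passwords in clear text, it should only be accessible to the rancid user; the cloginrc documentation linked above covers the permission requirements. The sketch below is a deliberately conservative check of my own that flags any group or other access, which may be stricter than what clogin itself enforces.

```python
import os
import stat


def is_private_file(path):
    """True if the file grants no permissions to group or others (e.g. mode 600)."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode & (stat.S_IRWXG | stat.S_IRWXO) == 0
```

If it returns False for `/usr/local/rancid/.cloginrc`, a `chmod 600` on the file as the rancid user tightens it up.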

router.db

This follows a specific format as described here – https://www.shrubbery.net/rancid/man/router.db.5.html

For our example we put in the following line. Please keep in mind you can use any name you wish but it has to either resolve via DNS or hosts file

[rancid@rancid var]$ cat router.db 
test-f5;bigip;up
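As a quick illustration of the format, the sketch below splits router.db lines into name, type and state. It assumes the semicolon-separated layout shown above and treats lines starting with `#` as comments; the Device type and function name are my own, and the router.db man page remains the authority on the full syntax.

```python
from typing import NamedTuple


class Device(NamedTuple):
    name: str
    device_type: str
    state: str


def parse_router_db(text):
    """Parse 'name;type;state' lines, skipping blanks and '#' comments."""
    devices = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(";")
        if len(fields) < 3:
            raise ValueError(f"expected name;type;state, got: {line!r}")
        devices.append(Device(fields[0], fields[1], fields[2]))
    return devices
```

Parsing the line above yields a single Device with name `test-f5`, type `bigip` and state `up`.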

First Run

[rancid@rancid ~]$ bin/rancid-run
[rancid@rancid ~]$

Well that was anticlimactic. Rancid typically doesn’t output at the console and reserves that for the logs in ~/var/logs.

[rancid@rancid logs]$ pwd
/usr/local/rancid/var/logs
[rancid@rancid logs]$ ls -altrh
total 4.0K
drwxr-xr-x. 3 rancid rancid  35 Nov  5 07:00 ..
-rw-r-----. 1 rancid rancid 270 Nov  5 07:00 networking.20191105.070023
drwxr-x---. 2 rancid rancid  40 Nov  5 07:00 .

[rancid@rancid logs]$ cat networking.20191105.070023 
starting: Tue Nov 5 07:00:23 CST 2019

/usr/local/rancid/var/networking does not exist.
Run bin/rancid-cvs networking to make all of the needed directories.

ending: Tue Nov 5 07:00:23 CST 2019
[rancid@rancid logs]$

OK, let’s run rancid-cvs. It’s nice that it creates the repos for you. It versions both the router configs and the router.db files.

[rancid@rancid ~]$ ~/bin/rancid-cvs

No conflicts created by this import

cvs checkout: Updating networking
Directory /usr/local/rancid/var/CVS/networking/configs added to the repository
cvs commit: Examining configs
cvs add: scheduling file `router.db' for addition
cvs add: use 'cvs commit' to add this file permanently
RCS file: /usr/local/rancid/var/CVS/networking/router.db,v
done
Checking in router.db;
/usr/local/rancid/var/CVS/networking/router.db,v  <--  router.db
initial revision: 1.1
done

# Proof of CVS creation
[rancid@rancid ~]$ find ./ -type d -name CVS
./var/CVS
./var/networking/CVS
./var/networking/configs/CVS

Rancid-run again!

[rancid@rancid ~]$ cd var/logs
[rancid@rancid logs]$ ls -altrh
total 8.0K
-rw-r-----. 1 rancid rancid 270 Nov  5 07:00 networking.20191105.070023
drwxr-xr-x. 5 rancid rancid  64 Nov  5 07:04 ..
drwxr-x---. 2 rancid rancid  74 Nov  5 07:05 .
-rw-r-----. 1 rancid rancid 741 Nov  5 07:05 networking.20191105.070555
[rancid@rancid logs]$ cat networking.20191105.070555
starting: Tue Nov 5 07:05:55 CST 2019

cvs add: scheduling file `.cvsignore' for addition
cvs add: use 'cvs commit' to add this file permanently
cvs add: scheduling file `configs/.cvsignore' for addition
cvs add: use 'cvs commit' to add this file permanently

cvs commit: Examining .
cvs commit: Examining configs
RCS file: /usr/local/rancid/var/CVS/networking/.cvsignore,v
done
Checking in .cvsignore;
/usr/local/rancid/var/CVS/networking/.cvsignore,v  <--  .cvsignore
initial revision: 1.1
done
RCS file: /usr/local/rancid/var/CVS/networking/configs/.cvsignore,v
done
Checking in configs/.cvsignore;
/usr/local/rancid/var/CVS/networking/configs/.cvsignore,v  <--  .cvsignore
initial revision: 1.1
done

ending: Tue Nov 5 07:05:56 CST 2019

The router.db we created in ~/var/router.db needs to move to ~/var/networking/router.db

[rancid@rancid var]$ mv ~/var/router.db ~/var/networking/
[rancid@rancid var]$ ~/bin/rancid-run 
[rancid@rancid var]$ cd logs
[rancid@rancid logs]$ ls -la
total 12
drwxr-x---. 2 rancid rancid  108 Nov  5 07:08 .
drwxr-xr-x. 5 rancid rancid   47 Nov  5 07:08 ..
-rw-r-----. 1 rancid rancid  270 Nov  5 07:00 networking.20191105.070023
-rw-r-----. 1 rancid rancid  741 Nov  5 07:05 networking.20191105.070555
-rw-r-----. 1 rancid rancid 1899 Nov  5 07:08 networking.20191105.070840

[rancid@rancid logs]$ cat networking.20191105.070840
starting: Tue Nov 5 07:08:40 CST 2019

/usr/local/rancid/bin/control_rancid: line 433: sendmail: command not found
cvs add: scheduling file `test-f5' for addition
cvs add: use 'cvs commit' to add this file permanently
RCS file: /usr/local/rancid/var/CVS/networking/configs/test-f5,v
done
Checking in test-f5;
/usr/local/rancid/var/CVS/networking/configs/test-f5,v  <--  test-f5
initial revision: 1.1
done
Added test-f5



Trying to get all of the configs.
test-f5: missed cmd(s): all commands
test-f5: End of run not found
test-f5 clogin error: Error: /usr/local/rancid/.cloginrc must not be world readable/writable
#

This file does have passwords after all, so let’s lock it down. Only the rancid user needs to read it, so mode 600 is enough.

[rancid@rancid ~]$ chmod 600 .cloginrc 
[rancid@rancid ~]$
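To confirm the permissions before the next run, a small check like the following can help. It is a sketch that assumes a stat that supports -c (GNU coreutils, as on CentOS), and the function name not_world_accessible is made up. It only looks at the bits for "other", which is what the clogin error above complains about.

```shell
# Hypothetical helper: succeed only if a file has no permissions for "other".
not_world_accessible() {
    mode=$(stat -c '%a' "$1") || return 2
    # The last octal digit holds the permissions for "other"; it must be 0.
    case "$mode" in
        *0) return 0 ;;
        *)  return 1 ;;
    esac
}
```

For example: not_world_accessible ~/.cloginrc && echo "cloginrc permissions look fine"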

Iterative Approach

I went through a few iterations of troubleshooting and looking at the logs. I did this because almost nobody gets the install 100% correct the first time. Therefore, it’s great to understand how to check the logs and make changes accordingly.
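Since every run writes a new timestamped log, most of that loop is "run, then read the newest log". A tiny helper makes this less tedious. This is a sketch, the name latest_log is hypothetical, and the default directory assumes the ~/var/logs location used in this install.

```shell
# Hypothetical helper: print the most recently modified file in a log directory.
latest_log() {
    dir="${1:-$HOME/var/logs}"
    # ls -t sorts newest first; rancid log names contain no spaces or newlines.
    newest=$(ls -t "$dir" 2>/dev/null | head -n 1)
    [ -n "$newest" ] || return 1
    printf '%s/%s\n' "$dir" "$newest"
}
```

After each rancid-run, cat "$(latest_log)" shows the log for that run.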

The final cloginrc looks like this

[rancid@rancid ~]$ cat .cloginrc
add user test-f5	root
add password test-f5	XXXXXXXXXX

#defaults for most devices
add autoenable *	1
add method *		ssh

The rancid.conf needed this line changed

SENDMAIL="/usr/sbin/sendmail"

And now we have a clean run!

[rancid@rancid ~]$ cat var/logs/networking.20191105.073209
starting: Tue Nov 5 07:32:09 CST 2019



Trying to get all of the configs.
All routers successfully completed.

cvs diff: Diffing .
cvs diff: Diffing configs
cvs commit: Examining .
cvs commit: Examining configs

ending: Tue Nov 5 07:32:20 CST 2019

Scheduled Runs

On UNIX, cron is the usual way to run scheduled jobs, and here is a good crontab to start with. You can edit your crontab by running “crontab -e” or list it by running “crontab -l”.

#Run config differ twice daily
02 1,14 * * * /usr/local/rancid/bin/rancid-run

#Clean out config differ logs
58 22 * * * /usr/bin/find /usr/local/rancid/var/logs -type f -mtime +7 -delete

This crontab runs rancid twice a day at 2 minutes past the hour, at 01:02 and 14:02. It also deletes logs older than 7 days once a day at 22:58. We do not want the drive to fill up due to noisy logs.
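Before trusting a cron entry that ends in -delete, it can be worth previewing what it would remove. The sketch below uses the same find tests as the cleanup entry but prints the matches instead of deleting them. The function name list_old_logs is hypothetical.

```shell
# Hypothetical helper: list (without deleting) files older than N days.
# Same tests as the cron entry, with -print in place of -delete.
list_old_logs() {
    dir="$1"
    days="${2:-7}"
    find "$dir" -type f -mtime +"$days" -print
}
```

For example: list_old_logs /usr/local/rancid/var/logs 7. Note that find counts age in whole 24-hour periods, so -mtime +7 only matches files that are at least 8 days old.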

Web Interface

Rancid is nearly 100% CLI, but there are add-on tools for CVS that we can use, namely cvsweb. FreeBSD was a heavy user of CVS and created this project/package.

cvsweb requires apache and “rcs”. RCS does not yet exist in EPEL for CentOS 8.0.

[root@rancid ~]# yum install httpd

[root@rancid ~]# curl -O https://people.freebsd.org/~scop/cvsweb/cvsweb-3.0.6.tar.gz
[root@rancid ~]# tar xzf cvsweb-3.0.6.tar.gz

[root@rancid cvsweb-3.0.6]# cp cvsweb.cgi /var/www/cgi-bin/

[root@rancid cvsweb-3.0.6]# mkdir /usr/local/etc/cvsweb/
[root@rancid cvsweb-3.0.6]# cp cvsweb.conf /usr/local/etc/cvsweb/

[root@rancid httpd]# chmod 755 /var/www/cgi-bin/cvsweb.cgi

We need to tell cvsweb where the repo is! Find the @CVSrepositories section in /usr/local/etc/cvsweb/cvsweb.conf and add a ‘Rancid’ entry:

@CVSrepositories = (
        'Rancid'  => ['Rancid Repository', '/usr/local/rancid/var/CVS'],

Now let’s start up apache and let it rip!

[root@rancid cvsweb-3.0.6]# systemctl enable httpd
Created symlink /etc/systemd/system/multi-user.target.wants/httpd.service → /usr/lib/systemd/system/httpd.service.
[root@rancid cvsweb-3.0.6]# systemctl start httpd

# Enable port 80 on firewall

[root@rancid httpd]# firewall-cmd --zone=public --add-service=http --permanent
success
[root@rancid httpd]# firewall-cmd --reload
success

Wait, it still doesn’t work. Let’s check /var/log/httpd/error_log

Can't locate IPC/Run.pm in @INC (you may need to install the IPC::Run module) (@INC contains: /usr/local/lib64/perl5 /usr/local/share/perl5 /usr/lib64/perl5/vendor_perl /usr/share/perl5/vendor_perl /usr/lib64/perl5 /usr/share/perl5) at /var/www/cgi-bin/cvsweb.cgi line 100.
BEGIN failed--compilation aborted at /var/www/cgi-bin/cvsweb.cgi line 100.
[Tue Nov 05 08:02:58.091776 2019] [cgid:error] [pid 20354:tid 140030446114560] [client ::1:37398] End of script output before headers: cvsweb.cgi

On CentOS 8, the easiest way to get this module seems to be the PowerTools repository – https://centos.pkgs.org/8/centos-powertools-x86_64/perl-IPC-Run-0.99-1.el8.noarch.rpm.html

[root@rancid httpd]# dnf --enablerepo=PowerTools install perl-IPC-Run
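Rather than reloading the page and reading error_log after each fix, you can ask Perl directly whether it can load a module. This is a minimal sketch; the function name have_perl_module is made up, and it assumes the perl on your PATH is the same one Apache runs for the CGI script.

```shell
# Hypothetical helper: succeed if Perl can load the given module.
have_perl_module() {
    perl -M"$1" -e 1 2>/dev/null
}
```

For example: have_perl_module IPC::Run && echo "IPC::Run is installed". Once the module is present, "perl -c /var/www/cgi-bin/cvsweb.cgi" compiles the script without running it, which surfaces syntax errors without going through Apache. If Perl complains that -T is on the #! line, add -T to that command.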

Then I ran into the following issue, which seems to be a known bug. I manually edited the file as recommended in the patch.

"my" variable $tmp masks earlier declaration in same statement at /var/www/cgi-bin/cvsweb.cgi line 1338.
syntax error at /var/www/cgi-bin/cvsweb.cgi line 1195, near "$v qw(hidecvsroot hidenonreadable)"
Global symbol "$v" requires explicit package name (did you forget to declare "my $v"?) at /var/www/cgi-bin/cvsweb.cgi line 1197.
Global symbol "$v" requires explicit package name (did you forget to declare "my $v"?) at /var/www/cgi-bin/cvsweb.cgi line 1197.
syntax error at /var/www/cgi-bin/cvsweb.cgi line 1276, near "}"
  (Might be a runaway multi-line << string starting on line 1267)
syntax error at /var/www/cgi-bin/cvsweb.cgi line 1289, near "}"
syntax error at /var/www/cgi-bin/cvsweb.cgi line 1295, near "}"
syntax error at /var/www/cgi-bin/cvsweb.cgi line 1302, near "}"
syntax error at /var/www/cgi-bin/cvsweb.cgi line 1312, near "}"
syntax error at /var/www/cgi-bin/cvsweb.cgi line 1336, near "}"
syntax error at /var/www/cgi-bin/cvsweb.cgi line 1338, near ""$tmp,v" }"
/var/www/cgi-bin/cvsweb.cgi has too many errors.

Are we there yet?

Yay – We can see the CVS root!

And we can drill into router.db and other areas!

Security

We really should secure this page for two reasons: 1) we are running perl scripts, and cgi-bin is notoriously insecure, and 2) we have router configs, possibly with passwords and passphrases.

[root@rancid ~]# htpasswd -c /etc/httpd/.htpasswd dwchapmanjr
New password: 
Re-type new password: 
Adding password for user dwchapmanjr
[root@rancid ~]#

Create the /var/www/cgi-bin/.htaccess

AuthType Basic
AuthName "Restricted Content"
AuthUserFile /etc/httpd/.htpasswd
Require valid-user

Set permissions

[root@rancid html]# chmod 640 /etc/httpd/.htpasswd 
[root@rancid html]# chmod 640 /var/www/cgi-bin/.htaccess
[root@rancid html]# chown apache /etc/httpd/.htpasswd 
[root@rancid html]# chown apache /var/www/cgi-bin/.htaccess

We then need to allow overrides by editing /etc/httpd/conf/httpd.conf so that the .htaccess file actually takes effect.

# Change AllowOverride to All

<Directory "/var/www/cgi-bin">
    #AllowOverride None
    AllowOverride All
    Options None
    Require all granted
</Directory>

And then “systemctl restart httpd”

With any luck you should get a user/pass prompt now! It is not the most secure, since Basic authentication over plain HTTP sends the credentials unencrypted, but it is something.

Final Words

In this article we have stood up rancid from scratch. We have also gone over some basic troubleshooting steps and configured apache and cvsweb to visually browse the files.