In my article Intro To Azure Active Directory Domain Services, we discussed environments with minimal infrastructure. With all of the RDP exploits out there, it is typically best not to expose RDP over the internet. Since Azure Bastion is not yet fully available, the next best thing, aside from setting up a VPN appliance, is to use the Point-to-site functionality of a Virtual Network Gateway.
Prerequisites
The first prerequisite for client VPN through a Virtual Network Gateway is to actually provision one. OpenVPN compatibility requires at least the VpnGw1 SKU; it will not work with Basic.
It requires two subnets: one for the inside leg of the gateway and another for the client-side address pool. The inside subnet must be dedicated to the Virtual Network Gateway and not shared with other devices.
Authentication is handled via either RADIUS or certificates. If you are reading this article for a minimized infrastructure, you probably do not have RADIUS servers, so certificates it is.
Provisioning
The provisioning process is fairly simple, although it can take 30-60 minutes for the Virtual Network Gateway to fully provision before you can use the Point-to-site configuration. There are only a few simple questions to answer.
That’s really it for the initial provisioning.
Configuration
Some basic Point-to-site configurations need to be set.
The next part is the most difficult: a root certificate and at least one child certificate have to be provisioned. Microsoft has good documentation on this. Doing it in PowerShell requires Windows 10 or Server 2016 or higher.
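Microsoft's documented approach uses New-SelfSignedCertificate and looks roughly like the below; the subject names here are placeholders, so check the linked documentation for the authoritative commands.

# Self-signed root certificate (placeholder name P2SRootCert)
$cert = New-SelfSignedCertificate -Type Custom -KeySpec Signature `
    -Subject "CN=P2SRootCert" -KeyExportPolicy Exportable `
    -HashAlgorithm sha256 -KeyLength 2048 `
    -CertStoreLocation "Cert:\CurrentUser\My" `
    -KeyUsageProperty Sign -KeyUsage CertSign

# Child (client) certificate issued by that root (placeholder name P2SChildCert)
New-SelfSignedCertificate -Type Custom -DnsName P2SChildCert -KeySpec Signature `
    -Subject "CN=P2SChildCert" -KeyExportPolicy Exportable `
    -HashAlgorithm sha256 -KeyLength 2048 `
    -CertStoreLocation "Cert:\CurrentUser\My" `
    -Signer $cert -TextExtension @("2.5.29.37={text}1.3.6.1.5.5.7.3.2")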
The name is arbitrary, but the “Public certificate data” is the Base64 content between the “-----BEGIN CERTIFICATE-----” and “-----END CERTIFICATE-----” lines.
It can be a pain for those of us not familiar with certificates and command-line tools like openssl. The idea is that you have a root certificate authority that then issues individual certificates per user or group of users. If a key becomes compromised, you can revoke the individual certificate or untrust the entire certificate authority. I like the idea of creating a CA chain per organization you grant access to.
In this article we walked through creating the required resources and configuring them. We relied heavily upon the Microsoft documentation, but it was fairly complete and well illustrated.
Recently I had the pleasure of standing up Azure Active Directory Domain Services (AADDS) for a client. It has some specific use cases and limitations, but it can be great for your environment.
This article walks through the background, some of the limitations, and how to provision it.
What is Azure Active Directory Domain Services?
This is a natural question. At a high level, let’s talk about the use case for it. In a small cloud environment with a minimal number of VMs, standing up redundant Domain Controllers can chew up a good chunk of your budget. From there, you still have to manage and monitor those domain controllers.
AADDS is a managed platform for this. On the back end, it appears to be appliances that provide most of the functionality of Active Directory, with some limitations. It also offers a one-way sync from Azure Active Directory (AAD) to AADDS, which allows you to use Azure Active Directory credentials to log in to VMs.
Use Case
As touched on above, most administrators want a central credential repository for their cloud VMs. If they are used to on-prem Active Directory but want to tie it to Azure AD, AADDS makes sense, particularly in small environments where there may not be budget to host, manage, and maintain full domain controllers with adequate redundancy.
Limitations
There are some notable limitations, so if you are looking for the full flexibility of native Domain Controllers, AADDS is not it.
One Way Replication
Replication only occurs from AAD to AADDS. You can make changes in AADDS via the normal Active Directory Users and Computers (ADUC) or other tools, but they do not replicate back to AAD. AAD can, however, overwrite those changes.
Azure Active Directory does support AD Connect to connect to on-prem domain controllers, but it is not compatible with AADDS at this time.
No Custom Attributes
It does not support extending the schema or adding extra custom attributes.
External Guests
While external guest accounts will replicate into AADDS, their password hashes are inaccessible, so you will not be able to log in with those accounts. An AAD account local to the tenant will have to be provisioned instead.
Lack of Control
When you create or change AAD attributes, you have no control over the replication interval. The initial sync may take an hour or two, and provisioning AADDS takes quite a bit of time as well. In practice, in a smaller environment, replication does seem to happen within 5-15 minutes.
Provisioning
The first step is to select AADDS from the Marketplace and click Create.
On the Basics tab, pay particular attention to the DNS domain name. Best practices for domain naming have changed over the years. These settings cannot be changed after implementation.
On the Networking tab, it will by default create a new Virtual Network and subnet. You can choose an existing virtual network instead, which is the more likely case. It does, however, indicate you should use an unused subnet for this. I have not found much information on why, but it may have to do with the Network Security Group it creates and applies.
On the next screen, you elect who can manage this environment. Users in this group will function as the “Domain Admins” you are used to in Active Directory.
Next, we can choose which objects to synchronize; typically it will be all objects. If you think you may want to scope it, though, choose that first: you can go from scoped to all, but you have to recreate the deployment to reverse it.
Finally, we review the configuration, but take note: for this all to work, your AAD password hashes have to be stored in AADDS.
Final Words
In this article we gave a brief overview of Azure Active Directory Domain Services and some use cases. We also went over some limitations and cases where it would not be a good fit. Finally, we walked through the actual deployment.
Earlier this year, my wife decided to start her own hair salon and do booth rental (Pretty Hare Salon). We set up Square for appointment scheduling and credit cards. It has nice reporting features, but there is literally no forecasting. Forecasting helps us determine pricing and when to increase or decrease marketing/advertising.
Parameters of the Problem
To work around this lack of forecasting, I found an export feature that exports appointments as CSV. Unfortunately, it does not list the pricing associated with each appointment. At the time I implemented something in Perl, but I decided to rewrite it in Golang. The application takes two CSV files as input (1 - appointments and 2 - pricing) and calculates weekly totals from them.
Helper Function
Error checking and reporting requirements are fairly minimal for this, so I use a fairly basic helper function.
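The helper itself is not shown in the export, so here is a minimal sketch, assuming a fatal-on-error policy and a hypothetical name of checkError (it relies on the standard library “log” package):

func checkError(err error) {
	// For a tool this small, any error is fatal; log it and exit
	if err != nil {
		log.Fatal(err)
	}
}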
To load the CSV into a variable, I opted for os.Open(), which takes the file path; I pass the paths in through command-line arguments using os.Args. os.Open returns a pointer to a file.
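As a sketch, the open-and-reader setup looks like the below, assuming the appointments path is the first argument, the checkError helper from above, and imports of “encoding/csv” and “os”:

	// Open the appointments CSV passed on the command line (os.Args[1])
	appointmentsFile, err := os.Open(os.Args[1])
	checkError(err)
	defer appointmentsFile.Close()

	// Wrap the file in a CSV reader that streams one record at a time
	appointments := csv.NewReader(appointmentsFile)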
The appointment file is the full history, and we do not need all of the data in it. It can get very large over time, so it is best not to load it completely into memory. We also have logic to ignore anything older than 2 weeks, as this is a forecast and we do not care about historicals.
for {
	appointment, err := appointments.Read()
	if err == io.EOF {
		break
	}
	// Any other read error is fatal; checkError is the helper sketched above
	checkError(err)
	// Business logic in here
}
Each appointment record can then be read as a single-dimension slice. Something similar to the below sits inside the for loop shown above.
if appointment[2] == "accepted" {
	// More business logic here
}
Looking Up Data
The pricing information, which we fully loaded into a two-dimensional slice, can be iterated as follows.
for _, price := range priceRecords {
	if price[0] == service {
		return strconv.ParseFloat(price[8][1:], 64)
	}
}
return 0, errors.New("Could not find service - " + service)
The “service” variable is the service we are looking up. In each price record, position 0 is the service name we are matching on, and position 8 is the actual price.
The price field (8) has a dollar sign, so I used string slicing to omit the first character, allowing me to parse the float:
price[8][1:]
The pricing file is highly controlled, so I have very little error checking here.
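For completeness, here is a sketch of the full-load side, again assuming the pricing path arrives as the second argument and the hypothetical checkError helper:

	// Open the pricing CSV (os.Args[2]); this file stays small
	pricingFile, err := os.Open(os.Args[2])
	checkError(err)
	defer pricingFile.Close()

	// ReadAll returns a [][]string - the two-dimensional slice iterated above
	priceRecords, err := csv.NewReader(pricingFile).ReadAll()
	checkError(err)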
Final Words
In this article we discussed opening and iterating through CSV files using two methods: fully loading the file into memory, and iterating line by line. At this point you should have a good idea of which method to use and when.
I was writing an SSL Certificate Decoder ( https://tools.woohoosvcs.com/certdecoder/ ) and came across an interesting problem. I wanted to validate a certificate chain but had some difficulty figuring it out. The problem was that I needed to find the most child certificate in a seemingly random list of certificates. What I am calling the most child certificate is the one at the bottom of the chain: it was issued but does not issue any of its own. Users can provide the cert chain in any order, so we cannot rely on the order of submission in this tool.
SSL Certificate Chains
To understand the problem, one needs to understand an SSL certificate chain. Each entry in a chain is an individual certificate, and the certificates are linked by their issuers. Each has a subject section describing the certificate itself and an issuer section indicating the certificate authority certificate that issued it. This goes all the way up until you hit a root certificate; root certificates sign themselves.
Here you can see the cert chain for this site. The SAN, or alternate names, that allow tools.woohoosvcs.com to be valid are further down, below this screenshot.
Recursion
It had been a few decades since I sat in a Computer Science class, but I do remember a few control structures. One of them is recursion. Recursion can be a difficult concept to grasp and an even more difficult one to implement; Golang has an example of it here. At a high level, a recursive function is one that calls itself, nesting the call stack at each layer. When you do not know how many certificates and iterations you need to go through to find the chain, recursive functions help. Alternatively, you would have to use multiple layers of “for” loops, matching the current cert against one that may be a child. In a past life I may have implemented a few layers of “for” loops to statically account for the most common scenarios.
Recursion can be tricky: used incorrectly, it can cause stack overflows as the function calls itself indefinitely. The key to a good recursive function is a way out; it needs to be able to exit without calling itself forever. There are also platform limits on how deep the recursion can go, at which point an exception or error is thrown.
It is still a work in progress, but after iterations of playing with linked lists and multiple for loops, this is what I landed on.
// findChildOfCert walks the list looking for a certificate issued by
// "cert". If one is found, it recurses with that more-child certificate;
// otherwise "cert" is the bottom of the chain.
func findChildOfCert(certs []x509.Certificate, cert x509.Certificate) x509.Certificate {
	if len(certs) <= 1 {
		return cert
	}
	result := cert
	for _, item := range certs {
		if item.Issuer.CommonName == cert.Subject.CommonName {
			return findChildOfCert(certs, item)
		}
	}
	return result
}
It is kicked off in a parent function via
childCert := findChildOfCert(cs, cs[0])
Here cs is a slice (Golang speak for a dynamic array) of certificates, much like “certs” in the recursive function. We pass it the list of certificates and the very first one.
On the first call, it checks whether this certificate issued any others in the list. If so, it calls itself again with the issued certificate (more child than the current one) and repeats the process.
When it cannot find a certificate issued by the currently iterated certificate (the most child record), the for loop exits and the function simply returns the cert it was called with. At that point the stack completely unwinds, passing the “answer” certificate all the way back. That result is assigned to the childCert variable.
Validating the Cert
Golang provides a few options for validating the cert. Once you have the most child certificate, you can do something like the below.
// Build a pool of every supplied certificate except the child itself
roots := x509.NewCertPool()
for i := range cs {
	if !cs[i].Equal(&childCert) {
		roots.AddCert(&cs[i])
	}
}

// Trust check: the chain must roll up to a trusted (system) root
opts := x509.VerifyOptions{
	Intermediates: roots,
}

// Continuity check: the supplied pool itself acts as the roots
opts2 := x509.VerifyOptions{
	Roots: roots,
}

if _, err := childCert.Verify(opts); err != nil {
	status = append(status, "Not Trusted By Root - failed to verify certificate: "+err.Error())
} else {
	status = append(status, "Trusted Chain")
}

if _, err := childCert.Verify(opts2); err != nil {
	status = append(status, "Not Valid(contiguous) - failed to verify certificate: "+err.Error())
} else {
	status = append(status, "Valid Chain")
}
I load up a “roots” certificate pool from the certificates provided, excluding the child certificate itself. From there I perform two validations. One checks that the chain is trusted, meaning it rolls up to a root trusted by the source used. The other checks that the chain is valid: is there continuity in the chain, or is it broken? A chain can be valid but untrusted. Knowing the difference may help you in a rare case.
Stack Overflow
I actually found a stack overflow while doing a regression test with a self-signed certificate. The code above ended up comparing the certificate to itself over and over, trying to keep going down the rabbit hole until the stack blew up.
Luckily my unit testing caught this, so it would never have gone to production. If you're not sure what unit testing is, check out my article Unit Testing Golang on App Engine. The fix was simple: make sure the recursive function never compares a cert to itself, using the .Equal() function.
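A minimal sketch of that guard inside the recursive function's loop (the shipped fix may differ slightly):

	for _, item := range certs {
		// Never compare a certificate to itself; a self-signed cert has
		// issuer == subject and would otherwise recurse forever
		if item.Equal(&cert) {
			continue
		}
		if item.Issuer.CommonName == cert.Subject.CommonName {
			return findChildOfCert(certs, item)
		}
	}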
Final Words
In this article we walked through a useful use case for recursion and the background of the technology that needed it. I even provided some of the errors that happen when you fail to use recursion properly, as I accidentally did!
If you have landed here, it is most likely because you attempted to implement the subject and ran into errors. In my case, I ran into errors when splitting my Golang app into packages and adding unit testing. To compound it, I was automatically deploying via CloudBuild triggers.
This article is not a “How To” on actual unit testing, but on getting it to work under this unique combination of tools. A great unit testing article I found is here - https://blog.alexellis.io/golang-writing-unit-tests/
How To Setup
Before we talk about what might have gone wrong, let's talk about how to set this up properly. There has been a lot of confusion since Golang went to version 1.11 and started supporting modules. Since then, GOPATH appears to be a less supported method, though some tools still look for it.
Directory Structure
My directory structure ended up as follows.
Modules help with the dependency needs of Golang applications. Previously, private packages were difficult to manage, as were specific versions of public packages. Modules address many of these issues. Here is a good article on modules - https://blog.golang.org/using-go-modules
The key to my issues seemed to involve modules. Google App Engine's migrating-to-1.11 guide recommends the following:
The preferred method is to manually move all http.HandleFunc() calls from your packages to the main() function in your main package.
Alternatively, import your application's packages into your main package, ensuring each init() function that contains calls to http.HandleFunc() gets run on startup.
My app’s directory is “tools.woohoosvcs.com”, so I named the module the same. Run “go mod init”:
$ go mod init tools.woohoosvcs.com
go: creating new go.mod: module tools.woohoosvcs.com
dwcjr@Davids-MacBook-Pro tools.woohoosvcs.com % cat go.mod
module tools.woohoosvcs.com
go 1.13
Importing Private Packages
This then lets us refer to packages and modules under it via the module path.
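For example, the subnetcalculator package under this module is imported by its module-qualified path (a sketch; the package name comes from my directory layout):

	import "tools.woohoosvcs.com/subnetcalculator"

The CloudBuild configuration (cloudbuild.yaml) then references the same path when running tests ahead of the deploy: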
steps:
- id: subnetcalculator_test
  name: "gcr.io/cloud-builders/go"
  args: ["test", "tools.woohoosvcs.com/subnetcalculator"]
  # We use modules, but this builder wants GOPATH set and they are not compatible.
  env: ["GOPATH=/fakepath"]
- name: "gcr.io/cloud-builders/gcloud"
  args: ["app", "deploy"]
An interesting tidbit: the “name” is a Docker image from Google Container Registry (gcr.io).
The first step, id subnetcalculator_test, runs the “go” image with “go test tools.woohoosvcs.com/subnetcalculator”. The go image wants GOPATH set, but go test fails under modules when it is, so I had to set it to something fake.
The second step uses the gcloud deployer, which consumes app.yaml.
Before implementing modules, I could get it to run locally by importing ./subnetcalculator, but then “gcloud app deploy” would fail:
2019/11/22 01:39:13 Failed to build app: Your app is not on your GOPATH, please move it there and try again.
building app with command '[go build -o /tmp/staging/usr/local/bin/start ./...]', env '[PATH=/go/bin:/usr/local/go/bin:/builder/google-cloud-sdk/bin/:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin HOSTNAME=ace17fcba136 HOME=/builder/home BUILDER_OUTPUT=/builder/outputs DEBIAN_FRONTEND=noninteractive GOROOT=/usr/local/go/ GOPATH=/go GOPATH=/tmp/staging/srv/gopath]': err=exit status 1, out=go build: cannot write multiple packages to non-directory /tmp/staging/usr/local/bin/start.
The error was vague, but I noticed it was related to the import path. I tried moving the folder to $GOPATH/src; it then seemed to deploy via “gcloud app deploy” but failed in the automated CloudBuild trigger:
------------------------------------ STDERR ------------------------------------
2019/11/21 21:31:39 staging for go1.13
2019/11/21 21:31:39 GO111MODULE=auto, but no go.mod found, so building with dependencies from GOPATH
2019/11/21 21:31:39 Staging second-gen Standard app (GOPATH mode): failed analyzing /workspace: cannot find package "tools.woohoosvcs.com/subnetcalculator" in any of:
($GOROOT not set)
/builder/home/go/src/tools.woohoosvcs.com/subnetcalculator (from $GOPATH)
GOPATH: /builder/home/go
--------------------------------------------------------------------------------
ERROR
ERROR: build step 0 "gcr.io/cloud-builders/gcloud" failed: exit status 1
It was like a balancing act with a 3-legged chair! Once I initialized the module and adjusted the imports, though, it worked great.
Final Words
Here we worked through a best practice of using modules and an internal package to do automated build and deploy on Google App Engine. Unit testing with automated deploys is important so that broken builds do not get pushed to production.
Over the past few days, I put a few hours towards writing a web app to perform some basic CIDR subnet calculations. I wanted to share the link to it ( https://tools.woohoosvcs.com/subnetcalculator/ ) and also walk through how I deployed it to Google App Engine.
What is a Subnet Calculator?
When provisioning subnets, you typically need to determine your requirements. Sizing a subnet based on the number of hosts, or working out what the mask will be, can take a little bit of time, particularly when you do not calculate it often. Subnet calculators help save some time on this.
Why did I do it?
I have written a few over the years; my early ones were written in C. Golang has a few libraries in the net package to help with this. I had to use one the other day and figured: why not write my own? And while doing this, why not share how to deploy your own app?
I have a main.go file and a static view.html file. Since I already have a default service, I am deploying this app as its own service, which requires a dispatch rule for my custom domain name.
Don’t forget to add the custom domain to the Settings section!
The easiest way to get a Golang app running is as follows. Google App Engine will run the main function in the main package, and it sets the PORT environment variable.
package main

import (
	"log"
	"net/http"
	"os"
)

func main() {
	// App Engine supplies the listen port via the PORT environment variable
	port := os.Getenv("PORT")
	if port == "" {
		port = "8080"
		log.Printf("Defaulting to port %s", port)
	}

	http.HandleFunc("/subnetcalculator/", viewSubnetCalculator)

	log.Printf("Listening on port %s", port)
	if err := http.ListenAndServe(":"+port, nil); err != nil {
		log.Fatal(err)
	}
}

// This is the main page to view the Subnet Calculator
func viewSubnetCalculator(w http.ResponseWriter, r *http.Request) {
	http.ServeFile(w, r, "./view.html")
}
Best practice would dictate a separate main.go importing your own package, but if you were familiar enough with that, you may not need this article.
It is out of scope for this article, but the line http.HandleFunc("/subnetcalculator/", viewSubnetCalculator) tells Golang to direct the “/subnetcalculator/” URI to the viewSubnetCalculator function. That function simply serves the view.html file.
From there we simply run the following. I have tools.woohoosvcs.com under the same parent directory as dispatch.yaml.
cd tools.woohoosvcs.com
gcloud app deploy
cd ..
gcloud app deploy dispatch/dispatch.yaml
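For reference, a minimal dispatch.yaml along these lines routes the custom domain to the service; the service name “tools” here is a placeholder, so use whatever your app.yaml declares:

dispatch:
  - url: "tools.woohoosvcs.com/*"
    service: tools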
Securing Google App Engine
If you use a Web Application Firewall like Cloudflare, don't forget to ACL Google App Engine to only allow connections from it. Otherwise, the WAF can be completely bypassed.
Final Words
This was just a quick and dirty deploy of a Golang app to Google App Engine. As with all my apps, though, I run them through Cloudflare or at least some sort of Web Application Firewall.
I was watching cartoons with my 4-year-old when I looked down at my phone and all of a sudden saw a water droplet icon. I dragged it down to read a message indicating moisture was detected in my charging port. This couldn't be: I keep my phone nowhere near water, and I had no accidents with it that could cause this, even though it is IP68 certified.
What is moisture detection?
iFixit has a great article on this; it was the first link I clicked when reading up on the issue. If you actually had your phone near water (which is perfectly acceptable with an IP68 rating), it has some great steps for rectifying the problem. In short, the phone has a sensor that can detect water/moisture and will disable the USB port while the detection is active. This helps avoid a short or damage to the connectors. Apparently it detects moisture by monitoring resistance between pins.
Other things, like lint or debris, can cause similar issues. The key to troubleshooting is realizing it could be more than just water or moisture.
Other Things To Try
When I ran into this, I ran through the reboot and the hard reset without luck. I blew the port out with compressed air, to no avail. When I cleaned out my USB-C port with a toothpick I did find some debris, but clearing it did not resolve the issue immediately.
What finally did it for me, to my slight surprise, was powering off the phone, plugging it in, letting it fully charge, and then powering it on. I have a feeling the port status was cached and just needed to clear after the debris was removed.
Here is a list of things to try, in this order.
Power off phone
Try to clean out the port with tissue and toothpicks
Reboot phone
Hard reset phone (power button and volume down)
Blow out the port with compressed air, making sure no liquid propellant comes out of the can
Power off and charge to 100%, then power back on
Go to Application Settings / USB Settings and clear cache and data, then reboot
Factory reset - the option of last resort, because nobody likes doing this
Workarounds
If you are absolutely sure you do not have moisture or debris in there but need to charge, here are some options:
Wireless charging (this still works)
Power off the phone and try charging while powered off (if there is moisture in there, this could potentially damage the phone)
I hope you found this article before you went ahead and factory reset or sent the phone in for repair. With that said, even if you cannot get this error to go away, wireless charging still works; you will just be unable to charge via cable while powered on. On some phones you can still charge over the wire while the phone is powered off.
In this article we tackle VM orchestration. As I touched on in other articles, the desire is to dynamically spin up VMs as necessary. The Google Cloud constructs used are instance templates, instance groups, load balancers, health checks, and Salt (both state and reactor).
First Things First
In order to dynamically spin up VMs we need an instance group. For an instance group to work dynamically we need an instance template.
Instance Template
For this instance template, I will name it web-test. The name is important, but we'll touch on that later on.
For this demonstration we used CentOS 8. It can be any OS, but our Salt state is tuned for CentOS.
As we touched on in the Cloud-init on Google Compute Engine article, we need to automate the provisioning and configuration of this VM. Since Google's CentOS image does not come with cloud-init, we use the startup script to load it. Once loaded and booted, cloud-init configures the local machine as a salt-minion and points it to the master.
The startup script is below.
#!/bin/bash
if ! type cloud-init > /dev/null 2>&1 ; then
  # Log startup of script
  echo "Ran - `date`" >> /root/startup
  sleep 30
  yum install -y cloud-init
  if [ $? == 0 ]; then
    echo "Ran - yum success - `date`" >> /root/startup
    systemctl enable cloud-init
    # Sometimes GCE metadata URI is inaccessible after the boot so start this up and give it a minute
    systemctl start cloud-init
    sleep 10
  else
    echo "Ran - yum fail - `date`" >> /root/startup
  fi
  # Reboot either way
  reboot
fi
The first thing is to accept new minions, as this is usually a manual step. We then need the master to apply a state. Please keep in mind there are security implications to auto-accepting keys; these scripts do not take that into consideration, as they are just a baseline to get this working.
In order to have these steps happen automatically, we need to use Salt reactor, which listens for events and acts on them. Our reactor file looks like this. We could add some validation, particularly on the accept, such as validating that the minion name has “web” in it before pushing the wordpress state.
{# test server is sending new key -- accept this key #}
{% if 'act' in data and data['act'] == 'pend' %}
minion_add:
  wheel.key.accept:
    - match: {{ data['id'] }}
{% endif %}

{% if data['act'] == 'accept' %}
initial_load:
  local.state.sls:
    - tgt: {{ data['id'] }}
    - arg:
      - wordpress
{% endif %}
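For the reactor file to actually fire, the salt-master config also has to map the auth event to it. Assuming the file above is saved as /srv/reactor/auth.sls (a path of my choosing), the binding would look something like:

reactor:
  - 'salt/auth':
    - /srv/reactor/auth.sls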
This is fairly simple. When a minion authenticates for the first time, acknowledge it and then apply the wordpress state we worked on in our article on Salt State. Since we may have multiple, rotating servers that spin up and down, we will use Google's Load Balancer to point Cloudflare to.
Cloudflare does offer load balancing, but for the integration we want, it's easier to use Google's. The load balancer requires an instance group, so we need to set that up first.
Instance Groups
Instance groups are one of the constructs you can point a load balancer at. Google has two types of instance groups: managed, which will auto-scale based on health checks, and unmanaged, to which you manually add VMs. We will choose managed.
This name is not too important, so it can be any one you like.
Here we set the port name and number and the instance template. For this lab we disabled autoscaling, but in the real world autoscaling is why you want to set all of this up.
The health check expects an HTTP 200 response for the all clear. It is much better than a TCP check, as it can validate that the web server is actually responding with expected content. Since WordPress sends a 301 redirect, we do have to set the Host HTTP header here, otherwise the check will fail. Other load balancers only fail on 400-599, but Google expects exactly an HTTP 200, per their documentation - https://cloud.google.com/load-balancing/docs/health-check-concepts
And here you can see it is provisioning! While it does that, let’s move over to the load balancer.
Firewall Rules
The health checks for the load balancer come from a set range of Google IPs that we need to allow, and we can scope the allowance to tagged machines via network tags. Per Google's health check documentation, the HTTP checks come from two ranges: 35.191.0.0/16 and 130.211.0.0/22.
Here we only allow the health checks, from the Google-identified IP ranges, to machines tagged with “allow-health-checks”, on port 443.
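As a sketch, the equivalent rule via gcloud would look something like the below; the rule name and the default network are assumptions:

gcloud compute firewall-rules create allow-health-checks \
    --network=default \
    --source-ranges=35.191.0.0/16,130.211.0.0/22 \
    --target-tags=allow-health-checks \
    --allow=tcp:443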
Google Load Balancer
Initial
This is a crash course in load balancers if you have never set one up before; it is expected you have some understanding of front ends, back ends, and health checks. In the VPC section, we needed to allow the health check ranges, which we did above.
Back End Configuration
Google’s load balancers can be used for internal only or external to internal. We want to load balance external connections.
We will need to create a back end endpoint.
Luckily this is simple: we point it at the objects we already created and set session affinity so that traffic is persistent to a single web server. We do not want requests hopping between servers, as that may confuse the web services.
Front End Configuration
Health Check Validation
Give the load balancer a few minutes to provision and spin up. It should then show healthy if all is well. It never comes up right the first time; not even in a lab!
Troubleshooting
The important part is to walk through the process from beginning to end when something does not work. Here's a quick run-through:
On provisioning, is the instance group provisioning the VM?
What is the status of cloud-init?
Is salt-minion installing on the VM and starting?
Does the salt-master see the minion?
Reapply the state and check for errors
Does the load balancer see health?
Final Words
If it does come up healthy, the last step is to point your DNS at the load balancer public IP and be on your way!
Since Salt is such a complex beast, I have provided most of the framework and configs here. Some of the more sensitive files are truncated but left in place so that you know they exist. The standard disclaimer applies: I cannot guarantee the outcome of these files on your system or that they are best practice from a security standpoint.
Yesterday I was playing a bit with Google Load Balancers, and they tend to work best when you connect them to an automated instance group. I may touch on that in another article, but in short, it requires some level of automation: the instance group will attempt to spin up images automatically and, based on health checks, introduce them to the load-balanced cluster.
The Problem?
How do we automate provisioning? I have been touching on SaltStack in a few articles. Salt is great for configuration management, but in an automated workflow, how do you get Salt on there in the first place? That was my goal: to get Salt installed on a newly provisioned VM.
Method
Cloud-init is a very widely known method of provisioning a machine. From my brief understanding, it started with Ubuntu and then took off. I was briefly exposed to it in Spinning Up Rancher With Kubernetes. It makes sense and is widely supported. The concept is simple: a one-time provisioning of the server.
Google Compute Engine
Google Compute Engine (GCE) does support pushing cloud-init configuration (cloud-config) using metadata. You can set the “user-data” field, and if cloud-init is installed, it will find it.
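As a sketch, a cloud-config payload in the user-data field that installs salt-minion and points it at a master could look like the below; the master hostname is a placeholder:

#cloud-config
# cloud-init's salt minion module installs the package and writes the config
salt_minion:
  conf:
    master: salt.example.com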
The problem is that the only image that seems to support this out of the box is Ubuntu, and my current preferred platform is CentOS, although this is starting to change.
Startup Scripts
So if we don't have cloud-init, what can we do? Google does have functionality for startup and shutdown scripts via the “startup-script” and “shutdown-script” meta fields. I do not want a script that runs every time, though, and I do not want to reinvent the wheel writing a failsafe script that pushes salt-minion out and reconfigures it. For this reason, I came up with a one-time startup script.
The Solution
Startup Script
This is the startup script I came up with.
#!/bin/bash
if ! type cloud-init > /dev/null 2>&1 ; then
  echo "Ran - `date`" >> /root/startup
  sleep 30
  yum install -y cloud-init
  if [ $? == 0 ]; then
    echo "Ran - Success - `date`" >> /root/startup
    systemctl enable cloud-init
    #systemctl start cloud-init
  else
    echo "Ran - Fail - `date`" >> /root/startup
  fi
  # Reboot either way
  reboot
fi
This script checks to see if cloud-init exists. If it does, move along and don't waste CPU. If it does not, we wait 30 seconds and install it. Upon success we enable cloud-init, and either way we reboot.
Workaround
I played with this for a good part of a day trying to get it working. Without the wait and the other logging logic in the script, the following would happen:
2019-11-14T18:04:37Z DEBUG DNF version: 4.0.9
2019-11-14T18:04:37Z DDEBUG Command: dnf install -y cloud-init
2019-11-14T18:04:37Z DDEBUG Installroot: /
2019-11-14T18:04:37Z DDEBUG Releasever: 8
2019-11-14T18:04:37Z DEBUG cachedir: /var/cache/dnf
2019-11-14T18:04:37Z DDEBUG Base command: install
2019-11-14T18:04:37Z DDEBUG Extra commands: ['install', '-y', 'cloud-init']
2019-11-14T18:04:37Z DEBUG repo: downloading from remote: AppStream
2019-11-14T18:05:05Z DEBUG error: Curl error (7): Couldn't connect to server for http://mirrorlist.centos.org/?release=8&arch=x86_64&repo=AppStream&infra=stock [Failed to connect to mirrorlist.centos.org port 80: Connection timed out] (http://mirrorlist.centos.org/?release=8&arch=x86_64&repo=AppStream&infra=stock).
2019-11-14T18:05:05Z DEBUG Cannot download 'http://mirrorlist.centos.org/?release=8&arch=x86_64&repo=AppStream&infra=stock': Cannot prepare internal mirrorlist: Curl error (7): Couldn't connect to server for http://mirrorlist.centos.org/?release=8&arch=x86_64&repo=AppStream&infra=stock [Failed to connect to mirrorlist.centos.org port 80: Connection timed out].
2019-11-14T18:05:05Z DDEBUG Cleaning up.
2019-11-14T18:05:05Z SUBDEBUG
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/dnf/repo.py", line 566, in load
    ret = self._repo.load()
  File "/usr/lib64/python3.6/site-packages/libdnf/repo.py", line 503, in load
    return _repo.Repo_load(self)
RuntimeError: Failed to synchronize cache for repo 'AppStream'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/dnf/cli/main.py", line 64, in main
    return _main(base, args, cli_class, option_parser_class)
  File "/usr/lib/python3.6/site-packages/dnf/cli/main.py", line 99, in _main
    return cli_run(cli, base)
  File "/usr/lib/python3.6/site-packages/dnf/cli/main.py", line 115, in cli_run
    cli.run()
  File "/usr/lib/python3.6/site-packages/dnf/cli/cli.py", line 1124, in run
    self._process_demands()
  File "/usr/lib/python3.6/site-packages/dnf/cli/cli.py", line 828, in _process_demands
    load_available_repos=self.demands.available_repos)
  File "/usr/lib/python3.6/site-packages/dnf/base.py", line 400, in fill_sack
    self._add_repo_to_sack(r)
  File "/usr/lib/python3.6/site-packages/dnf/base.py", line 135, in _add_repo_to_sack
    repo.load()
  File "/usr/lib/python3.6/site-packages/dnf/repo.py", line 568, in load
    raise dnf.exceptions.RepoError(str(e))
dnf.exceptions.RepoError: Failed to synchronize cache for repo 'AppStream'
2019-11-14T18:05:05Z CRITICAL Error: Failed to synchronize cache for repo 'AppStream'
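For reference, the startup script can be attached to an instance (or baked into the instance template) with gcloud, along these lines; the instance name is a placeholder:

gcloud compute instances add-metadata my-instance \
    --metadata-from-file startup-script=startup.sh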
Or you can use the console and paste it in plain text.
Don’t feel bad if you can’t find these settings. They are buried here.
Final Words
In this article we walked through automating the provisioning. You can use cloud-init for all sorts of things, such as ensuring the machine is completely up to date before handoff, as well as adding users and keys. For our need, we just wanted to get Salt on there so the VM could plug into config management.
Today I got to play with the Salt win_wua module. Anyone who manages Windows servers knows all about the second Tuesday of the month, and the win_wua module can help greatly. I have recently been toying with Salt, as mentioned in some of my other articles like Introduction to SaltStack.
Update Methodology
In the environments I manage, we typically implement Microsoft Windows Server Update Services (WSUS), but in a manual fashion so that we can control the installation of the patches. WSUS is more of a gatekeeper against bad patches: we approve updates immediately to test servers only, which lets us burn them in for a few weeks. Then, when we're comfortable, we push them to production. This greatly helped mitigate this conflict of Windows Updates - https://community.sophos.com/kb/en-us/133945
The actual installation, though, is manual, since we need to trigger the install. It involved manually logging into various servers to push the install button and then reboot. For all my past complaints about this, I was unable to find something to easily trigger the installation of Windows updates.
Win_wua to the rescue
I originally thought I would need a Salt state to perform this, but the command-line module is so easy that I did not bother.
salt TESTSERVER win_wua.list
TESTSERVER:
    ----------
    9bc4dbf1-3cdf-4708-a004-2d6e60de2e3a:
        ----------
        Categories:
            - Security Updates
            - Windows Server 2012 R2
        Description:
            Install this update to resolve issues in Windows. For a complete listing of the issues that are included in this update, see the associated Microsoft Knowledge Base article for more information. After you install this item, you may have to restart your computer.
        Downloaded:
            .....
It then spews a ton of data about the pending updates. Luckily it has an option for a summary. Surprisingly, we use the same “list” function to install by setting a flag: the separate install function expects a list of specific updates you wish to install, but we just want to install all pending ones.
Before we install, check out the summary output
salt TESTSERVER win_wua.list summary=True
TESTSERVER:
    ----------
    Available:
        0
    Categories:
        ----------
        Security Updates:
            4
        Windows Server 2012 R2:
            4
    Downloaded:
        4
    Installed:
        0
    Severity:
        ----------
        Critical:
            3
        Moderate:
            1
    Total:
        4
OK, so let's install and only see the summary:
salt -t 60 LV-PSCADS01 win_wua.list summary=True install=True
LV-PSCADS01:
Passed invalid arguments to win_wua.list: 'int' object is not callable
.. versionadded:: 2017.7.0
Returns a detailed list of available updates or a summary. If download or
install is True the same list will be downloaded and/or installed.
Well, that's no fun! Not quite what we expected. It appears it's a known bug in 2017.7.1 that has since been fixed. Update your Salt minion, or perform the manual fix listed, and run again!
salt -t 60 TESTSERVER win_wua.list summary=True install=True
TESTSERVER:
    ----------
    Download:
        ----------
        Success:
            True
        Updates:
            Nothing to download
    Install:
        ----------
        Message:
            Installation Succeeded
        NeedsReboot:
            True
        Success:
            True
        Updates:
            ----------
            9bc4dbf1-3cdf-4708-a004-2d6e60de2e3a:
                ----------
                AlreadyInstalled:
                    False
                RebootBehavior:
                    Never Reboot
                Result:
                    Installation Succeeded
                Title:
                    2019-11 Servicing Stack Update for Windows Server 2012 R2 for x64-based Systems (KB4524445)
            9d665242-c74c-4905-a6f4-24f2b12c66e6:
                ----------
                AlreadyInstalled:
                    False
                RebootBehavior:
                    Poss Reboot
                Result:
                    Installation Succeeded
                Title:
                    2019-11 Cumulative Security Update for Internet Explorer 11 for Windows Server 2012 R2 for x64-based systems (KB4525106)
            a30c9519-8359-48e1-86d4-38791ad95200:
                ----------
                AlreadyInstalled:
                    False
                RebootBehavior:
                    Poss Reboot
                Result:
                    Installation Succeeded
                Title:
                    2019-11 Security Only Quality Update for Windows Server 2012 R2 for x64-based Systems (KB4525250)
            a57cd1d3-0038-466b-9341-99f6d488d84b:
                ----------
                AlreadyInstalled:
                    False
                RebootBehavior:
                    Poss Reboot
                Result:
                    Installation Succeeded
                Title:
                    2019-11 Security Monthly Quality Rollup for Windows Server 2012 R2 for x64-based Systems (KB4525243)
Of course, this is Windows, so we need a reboot. By default, win_system.reboot waits 5 minutes before rebooting; with the flags below, we can shorten that.
salt TESTSERVER system.reboot timeout=30 in_seconds=True
Salt State
If I wanted to automate the reboot after the update install, I could make this a state and have the installed updates trigger a reboot. In my scenario I do not need that, but if you want to try, check out this section on the win_wua states. The syntax is slightly different from the execution module we have been working with in this article.
Updating Multiple Servers
If you want to update multiple servers at once, you can do something like the following. The -L flag lets you set multiple targets as a comma-separated list.
salt -t 60 -L TESTSERVER,TESTSERVER2,TESTSERVER3 win_wua.list summary=True install=True
salt -L TESTSERVER,TESTSERVER2,TESTSERVER3 system.reboot timeout=30 in_seconds=True
We could even set a Salt grain to group these:
salt -L TESTSERVER,TESTSERVER2,TESTSERVER3 grains.set wua_batch testservers
salt -G wua_batch:testservers win_wua.list summary=True install=True
salt -G wua_batch:testservers system.reboot timeout=30 in_seconds=True
Throttling
If you are running this on-prem, or just flat out want to avoid an update and boot storm, you can throttle it using “salt -b”, as mentioned in Salt's documentation.
# This would limit the install to 2 servers at a time
salt -b 2 -G wua_batch:testservers win_wua.list summary=True install=True
Final Words
This article is likely only useful if you have Salt in your environment somewhere but never thought about using it on Windows. Salt is a great tool for configuration management on Windows, though most Windows admins think of other tools like GPO, SCCM, etc. to manage Windows.