Monday, September 30, 2013

Learning to Build a Server with Puppet in (almost) a Day

My development environment is a complete mess. I have multiple versions of various stacks, from Ruby to NodeJS, and I've run into so many issues trying to keep things compatible that I need to start splitting environments up by project. It's something I knew I would have to do, but I just didn't want to invest the time setting it up. What I needed was an easy way to spin up a base environment and, by project, describe what the environment should look like. Some research led me to several options like Vagrant, Razor, Capistrano, Chef, and Puppet. Ultimately, I chose to keep it simple by using Puppet in standalone mode to describe the configuration and a combination of bootstrapping and git to deploy a new environment. As a side benefit, my fellow developers could easily replicate the same environment, and these configurations could serve as a basis for building staging/production servers.

In theory, the challenge should have been learning the syntax of a Puppet manifest and finding out which existing modules would do a lot of the work for me. However, as it turns out, I had a lot of problems just getting the puppet command line program to run at all. In total, I spent about 3 days building a base VM image, installing the minimum environment to run Puppet, and writing my manifest.

Step 1: Build A Base Box

Even if I had decided to use Vagrant, I'd want to build my own base box image as a starting point. In VirtualBox, I created a new instance, mounted the Debian Wheezy ISO I had downloaded, spun it up, and followed the prompts. Since this is going to be my clean instance that I can clone and provision using Puppet, I did very little after finalizing the install. However, I did go in and set up sudo to allow no passwords for admin users, installed the guest additions, and configured SSH with my public key. At this point, I shut down the base instance, cloned it in the VirtualBox UI, set up the network to my liking, and started the clone so I could bootstrap Puppet onto it.

Step 2: Bootstrap Puppet

At this point, I was searching for the best way to get my Puppet manifests to the server so they could be run. I had them checked into source control with the rest of my application, but the new server didn't have Git or Puppet installed. All I had was SSH. Well, Capistrano is designed to push things out over SSH, so why not look at what others are doing with it? After a little searching, I found an article describing exactly what I needed. After looking at the setup, I decided to forgo Capistrano for the moment and just copy the bootstrap shell script to the new server myself with scp. I also didn't copy the Puppet directories/manifests, since I had them checked in and was planning to just clone the repo from the server.

So far, I only had a basic manifest. I figured I'd continue to tweak it once I got a clean box up and running. If I could get the environment in place, I should be good to dig into learning Puppet more, right? Well, not so fast - I still needed to get it installed. The bootstrap script described in the article installs RVM with 1.9.3-p125 as the default Ruby and installs Puppet within this context. This did not work, and I spent a day trying to figure out why. When it finished installing, I ran librarian-puppet to pull all the modules I needed. This failed with an error about a bad version number. I thought it was weird because I could run the same command on my other box without any issue (although it had all kinds of Rubies installed on it, so it was not a good control). Knowing that my current environment was a mess, I figured the problem was somewhere in the Ruby install, so I started there. After following the stack trace back into librarian-puppet, I found the line where it was calling puppet --version, apparently to get the version of Puppet. That call was producing the error. So, I ran it on the command line and got the following output:

# puppet --version

See 'puppet help' for help on available puppet subcommands

Ok, it still printed the version, but also dropped in that error message. Clearly, that wasn't going to parse into a valid version string. So, I tried:

# puppet help

See 'puppet help' for help on available puppet subcommands

Well, something was definitely wrong, and I'll save you the long story of everything I tried. Eventually, I decided to completely remove RVM and install Ruby from the Debian packages. That got me Ruby version 1.9.3-p194 (and it's not the patch version that made the difference either - I tried _many_ versions of Ruby via RVM). I installed all the gems again, held my breath, and ran librarian-puppet...

And it worked.

Step 3: Write a Puppet Manifest

So by now, I was too frustrated to deal with any issues with the manifest. Using the reference and modules from the forge as a guide, I cobbled together the various things I needed for my environment. After suffering through various parse errors in both my manifest and (unfortunately) some modules from the forge, I was able to apply my manifest with puppet in standalone mode. There were definitely issues, and I'm still learning the basics. My biggest problem was using the apt module to set up the sources for various packages like PostgreSQL and Nginx. First, the source would not run before the module that actually installed the package; second, it had issues retrieving the signing key; and, finally, the package lists were not updated, so the packages could not be installed.

To solve these problems, I had to dig a little bit into the APT module code to understand how it worked. Once I got a feeling for what it expected as inputs and what each part of the module provided, I was able to properly define my resources. Here's what I ended up with for installing PostgreSQL:

apt::source { 'postgresql':
   location   => '',
   release    => 'wheezy-pgdg',
   repos      => 'main',
   key_source => '',
}

class { 'apt::update': }

class { 'postgresql':
   version => '9.2',
   require => Class['apt::update'],
}

I solved my first problem by using the require parameter on the postgresql class to point to the apt::update class. Now, you might think that the apt::update class needs a require, or maybe a before pointing at the postgresql class. However, inside the APT module, an anchor is defined for the apt::update class. If you add another require/before, you'll create a circular dependency.
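For comparison, here is a minimal sketch of the same ordering written with Puppet's chaining arrow instead of the require parameter. It is an alternative way to express that one relationship, not something to add on top of it, and it assumes the postgresql class accepts the version parameter exactly as in my manifest above.

```puppet
# Sketch only: same ordering as require => Class['apt::update'],
# expressed with a chaining arrow. Use one form or the other.
class { 'apt::update': }

class { 'postgresql':
  version => '9.2',
}

# apt::update is applied before anything in the postgresql class.
Class['apt::update'] -> Class['postgresql']
```

I stuck with the require parameter because it keeps the dependency next to the resource that needs it.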

My next problem was obtaining the key used to verify the signed packages from the PostgreSQL source. In the APT module's documentation, there's this example:

apt::source { 'puppetlabs':
   location   => '',
   repos      => 'main',
   key        => '4BD6EC30',
   key_server => '',
}

Attempting to use that example just resulted in an error. What I wanted was the same wget command you see on the PostgreSQL site. Again, I dug through the module source and found an if statement:

  if $key_content {
    $method = 'content'
  } elsif $key_source {
    $method = 'source'
  } elsif $key_server {
    $method = 'server'
  }

Further down in the source, $method is used to select the command:

  $digest_command = $method ? {
    'content' => "echo '${key_content}' | /usr/bin/apt-key add -",
    'source'  => "wget -q '${key_source}' -O- | apt-key add -",
    'server'  => "apt-key adv --keyserver '${key_server}' ${options_string} --recv-keys '${upkey}'",
  }

So, I didn't need the key or key_server parameters - just the key_source to trigger the desired command.

My final problem was getting the package sources to update so apt-get would actually find the package. As you can see, I added the class { 'apt::update': } explicitly in the manifest. The source resource only defines a dependency on the update but doesn't actually trigger it. This makes sense, since you may want to define several sources and then do one update. Since the source defines the dependency, you just need to ensure the update is declared and then add a dependency on it from any modules/package resources that need the new sources.
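As a rough sketch of that "several sources, one update" idea, a manifest could look something like the following. It assumes the same version of the APT module used throughout this post, where apt::source accepts key_source and apt::update can be declared directly; other releases of the module may behave differently. The nginx source, the example.com URLs, and the package resource are placeholders for illustration and are not taken from my actual manifest.

```puppet
# Sketch only. All URLs below are placeholders, not real repositories.

apt::source { 'postgresql':
  location   => 'http://apt.example.com/postgresql/',         # placeholder
  release    => 'wheezy-pgdg',
  repos      => 'main',
  key_source => 'http://apt.example.com/postgresql/key.asc',  # placeholder
}

apt::source { 'nginx':
  location   => 'http://apt.example.com/nginx/',              # placeholder
  release    => 'wheezy',
  repos      => 'main',
  key_source => 'http://apt.example.com/nginx/key.asc',       # placeholder
}

# Declared once, no matter how many sources are defined above.
class { 'apt::update': }

# Anything installed from the new sources waits for the update.
package { 'nginx':
  ensure  => installed,
  require => Class['apt::update'],
}
```

The point of the layout is that each source only says where packages come from, the update is declared in one place, and every consumer depends on the update rather than on an individual source.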

Closing Thoughts

Overall, Puppet is fairly easy to start using. With the addition of the community-contributed modules that help configure common packages, it becomes even easier to jump in. As these mature, it will become easier to describe most of a system's configuration through a Ruby/JSON-like syntax. I needed to install and configure Nginx, NodeJS, PostgreSQL, Redis, MongoDB, god, and Samba. Everything other than god had a module available that I could install via librarian-puppet. Each module's documentation provided enough examples that I could copy and paste to get started. When in doubt, I could just open the module's source and, generally, figure out what I needed to do to fit my needs. After a day of work, I can now build out a working environment in about 10 minutes. Now all the excuses I had about it taking too long to set up a new VM are gone, and I can quickly bring up a new project or spin up testing instances with minimal effort.