February 18, 2018

Sending Heroku logs to CloudWatch Logs

Filed under: ResearchAndDevelopment — Ryan Wilcox @ 10:26 pm

For early stage experiments I like deploying to Heroku. Heroku’s free model will spin down servers that haven’t gotten traffic in a while, and none of my free experiments have enough continuous load to require anything more.

When I set up a Heroku app I usually set up Cloudfront as a CDN and S3 as a blob storage. Heroku is already just a layer over AWS EC2, so these choices are staying in the ecosystem, and easy to set up too

The other week I realized I could set up logging the same way: send to Cloudwatch Logs. I’ve used Cloudwatch Logs for other production applications, and it’s an OK common denominator tool. Cheap, and I can hook up subscription filters to send it to some other microservice for further processing.

So I wrote heroku-cloudwatch-sync: you deploy the app, set up a Heroku drain, and it goes to the specified Cloudwatch Log Group and Log Stream.

This was my first time with AWS Lambda functions, and current state of the world allows pretty rapid prototyping by just editing the file through a web based editor. For more complex things I ended up writing a complex Makefile to take care of packaging and deploying the script, and Cloudformation templates to set up the infrastructure.).

When I do further serverless work I’ll have to check to see if the serverless frameworks have any wins here: I don’t know if they make the deployment story any better or worse than what I figured out.

Anyway, my heroku cloudwatch sync provides an easy and cheap way to get logs from Heroku into an easy and cheap log store!

January 9, 2017

Installing and exploring SpaceVim

Filed under: ResearchAndDevelopment — Ryan Wilcox @ 12:52 am


Posted on this blog as comments may be insightful. My other – more personal – blog doesn’t have comments, so this one will have to do. I’m hoping future comments will be helpful getting this running on Ubuntu 18 or something

I’m interested in text editors again. My main driver for the last 5(??) years only runs on OS X, but I’m not sure if my next laptop will be an OS X laptop. Maybe if I sit out the big couple-year USB-C transition….

Time to look into alternative text editors then. For the last 6 months I’ve been using Atom at work to write React. Atom feels like not a good long term fit for me, even if it’s React / JSX autocomplete stuff is exceptional (and the only reason I stuck with it).

Then SpaceVim caught my eye. It’s a set of configurations for Vim that Makes Vim Better (Again?). Decided to try it, and spent a couple afternoons setting up Ubuntu 16 and SpaceVim.

Spoilers: I like SpaceVim. Slightly undecided about Ubuntu).

Install Vim 8 on Ubuntu 16:

  1. Follow instructions on Vim 8 on Ubuntu page
  2. On Xbuntu (what I run on an old netbook), you may need to install vim-gtk3. However, this seemed to Just Work on my Ubuntu 16 install on my (VM) desktop machine.
  3. sudo apt install exuberant-ctags

Install SpaceVim on Ubuntu 16

  1. Install dein, paste what it tells you into your .vimrc, fire up vim and run the install command
  2. Install SpaceVim
  3. Install apt install fcitx
    3 Configure fcitx (run im-config. Yes, change your config, set to fcitx)

After all this, run vim. It will download some plugins. Now run vim something.js: it will download more/different plugins. Do this for other languages you use that SpaceVim supports, to cache their plugins.

Neat SpaceVim things

One of the reasons I persisted with SpaceVim was both because this group knows more about hthe state of the VIM art than I do, and because the GUI stuff looked really good. I hate remembering a ton of keyboard commands, so I like picking command options out of a menu, and SpaceVim, like Spacemacs before it, uses a neat type-ahead menu window to let me select what command I want to trigger. This certainly helps in a console app like Vim (even though I run with the menus on).

Some examples:

  • See / trigger all commands in a GUI search-ahead style menu: (normal mode) f+.
  • See all plugins installed: (normal mode): leader+lp

SpaceVim for Markdown

I wrote this blog entry in Spacevim – it has some neat features specifically for writers. :Goyo. This puts Vim into a distraction free writing mode. Toggle out by reexecuting the ex command.

I created a user defined command alias for this called :Focus, because I’ll never remember :Goyo.

Making SpaceVim yours

SpaceVim has some opinionated defaults, some of which I don’t like. Check out ~/.vim/autoload/SpaceVim/default.vim. A couple of these opinions are:

  1. Menu bar off.
  2. Hide Toolbar
  3. Mouse off
  4. Font (Deja Sans Mono for Powerline, 11pt)

Luckily, SpaceVim has a couple customiation points: put your customized settings in ~/.local.vim OR in .local.vim (allowing you to have project specific VIM settings).

To reverse these customizations:

  • Menu bar on: set guioptions+=m
  • Mouse: set mouse=a
  • Font: set guifont=Courier 10

Disabling plugins you don’t like (or can’t use)

SpaceVim lets you disable plugins you’re never going to, or can’t, use. For example, I found that the version of Vim I installed had Python 3 instead of Python 2. IndentLines requires Python 2, and that choice is a compiletime, exclusively.

We can disable the plugin per normal SpaceVim conventions by adding this line to our ~/.local.vim:

let g:spacevim_disabled_plugins=['identline']

Force disabling IndentLines

IdentLines is a pretty foundational bit of SpaceVim tech, so SpaceVim doesn’t actually like us doing this. We need to provide a stub implementation for IndentLinesToggle, the main Vim command in question. Also added to the ~/.local.vim file:

command! IndentLinesToggle ;

Loading your own Vim plugins

While most settings can be tweaked in the ~/.local.vim, this file is exec-ed before SpaceVim loads. This means that dein, or your Vim plugin manager of choice, hasn’t loaded yet.

I’ve added some new functionality to my local copy of SpaceVim, to load a .local.after.vim (in ~/ or ., just like the built in version). Here’s how:

To ~/.vim/vimrc add:

function! LoadCustomAfterConfig() abort
    let custom_confs = SpaceVim#util#globpath(getcwd(), '.local.after.vim')
    let custom_glob_conf = expand('~/.local.after.vim')
    if filereadable(custom_glob_conf)
        exec 'source ' . custom_glob_conf

    if !empty(custom_confs)
        exec 'source ' . custom_confs[0]

call LoadCustomAfterConfig()

This allows me to keep my customizations – wither they need to be loaded before SpaceVim, or after SpaceVim, in a seperate file, with only this one change to the core configuration.

One thing worth noting: if you add a new package to your .local.after.vim you’ll need to run :call dein#install() to download the pcakage. (perhaps this runs too late for the auto-downloader to pick up).


SpaceVim is pretty good. There are some neat things I didn’t know Vim could do, and some stettings that my previous Vim configuration had set also. So, like coming home to Vim, with the old familiar armchair and a nice hyper modern sofa too.

TL; DR: pretty neat, especially it’s use of Unity Vim plugins that give a nice type-ahead GUI for commands, buffers, etc etc. A welcome jumpstart back into the world of Vim, and better / easier than anything I could have put tocgether myself.

Feels like there’s optimization that could be done (it lags pretty hard on my almost 8 year old netbook… on the other hand, 8 year old netbook….)

But that’s SpaceVim, and how to set it up on Ubuntu 16. Up, up and away!

July 31, 2016

Single file web components on legacy projects

Filed under: ResearchAndDevelopment — Ryan Wilcox @ 10:33 pm

I’m constantly looking for new innovations that I can use on brown-field projects. You know the ones: you come in and there’s a large engineering investment in a project, so architectural decisions have been made, and to change those decisions would require large amounts of capital (both political and money capital).

So we have legacy code. But we can still learn from other modern, greefield projects, and apply new concepts from front end frameworks into our code.

We’ll examine one of this new front end ideas here: Single File Components.

Single Files Components: The State of the Art in Javascript Land

Traditionally CSS, HTML and Javascript have been in separate files. Separate languages, separate files. React.js and Vue.js have an interesting idea: that often these pieces interact with each other at a component level – such that having the three different pieces in the same file makes sense.

For example, you’ve been given a task to implement a toolbar. Normally you create a couple CSS classes, put them in a CSS file. Create an HTML structure, which maybe you put in a template somewhere . Documentation… somewhere. Some Javascript to tie it together, in a forth place.

But what if everything lived in the same file, instead of being spread out? Because ultimately, you’re building a toolbar – a single piece of UI.

React.js solves this with JSX – a Javascript preprocessor that lets you splat HTML in the middle of your Javascript. JSX is somewhat interesting, but assumes you buy into (the DOM building parts of) React.

React is pretty cool, and – in my mind – is easier to integrate with pre-existing sites than say Angular. If you want a new, complicated, individual component on your site, maybe React is worth looking into.

But Vue.JS has another solution to the “let’s keep stuff about a single thing together”.

Vue.JS solves this with Single File Components.

Vue.js Component files take cues from HTML: Javascript lives in a script tag, CSS lives in a style tag, HTML templates live in a template tag. This might remind you of your first web pages, before you worried about separating out everything (for your one page/one file “HI MOM” page with the blink tags…).

Bringing the state of the art back home, to legacy-ville

With a little help from Gulp and Gulp VueSplit we can split out Vue Component files into their individual files.

The awesome thing about Gulp-Vuesplit is that it’s a simple text extraction: pulling out the different parts of a component file then writing them out to separate files.

Let’s take a look at a simple component:

// components/clicker.vue
function autoClickers() {
    var clickers = [].slice.call(document.querySelectorAll('.clicker-clickers')) // turn into iterable array

    clickers.forEach(function(current) {
          current.addEventListener('click'function() { alertcurrent.innerHTML ) }, false);
    } )
document.addEventListener'DOMContentLoaded'autoClickers )

.clickers {

We can split this file up with a simple Gulpfile:

var vueSplit = require'gulp-vuesplit' )

var gulp = require'gulp' )
var pipe = require'gulp-pipe' )
var vueSplit = require'gulp-vuesplit' ).default

gulp.task('vuesplit'function() {
    return gulp.src('components/*.vue').
        pipevueSplit( { cssFilenameScopedtrue } ) ).

This generates two files – one for the JS and one for the CSS. (It would also generate a file for anything in a template tag, but we don’t have one here).

The generated JS is nothing remarkable – it’s just everything in the script tag, but the generated CSS is:

.clicker-clickers {

Notice the .clicker- prefix of our CSS class? That was generated by Gulp-VueSplit. It performs a kind of auto name-spacing, either with the file name prefixed, or with a unique hash suffix.

Now that we’ve generated seperate files, we can use them in an HTML page – even a simple HTML with no extra Javascript frameworks!

    <link rel="stylesheet" type="text/css" href="dist/clicker.css" />
    <script src="dist/clicker.js"></script>

    <p class="clicker-clickers">Hello world</p>
    <p>Not clickable</p>

January 16, 2016

Migrating Vagrant setup from Puppet 3 to Puppet 4 (manifestdir)

Filed under: ResearchAndDevelopment — Ryan Wilcox @ 10:11 pm

I like Puppet with Vagrant. Puppet 4 removed an option I really liked: manifestdir

You see, often when I’d start a greenfield project, I’d include a Vagrantfile so getting a new developer set up is one command. I’ve talked about this in the past.

Now-a-days I want to keep my puppet scripts in a folder, organized and slightly away from the main code.

Because I’m boring I call this puppet/.

manifestdir let me do that: shove everything into puppet/ and not see it. One simple flag passed into Puppet from Vagrant.

I knew Puppet 4 was going to remove manifestdir, but I could ignore the problem as long as Vagrant base boxes shipped with Puppet 3.7. Which they no longer do – it seems to be 4.x now-a-days.

Bitrot is rough in the DevOps world.

It also means I had to revisit territory from the first half of an earlier blog post.

I figured out how to solve my problem by abusing Puppet’s Environment system

Skip ahead and see the diff

In my Vagrant setup I’ll have Puppet modules specific to my app: telling Puppet to install this version of Ruby, that database, this Node package, whatever. I’ll also have third party modules: actually doing the heavy lifting of downloading the right Linux package for whatever, etc.

So, I’m building on some abstraction.

I use puppet module install to pull in third party modules. Puppet 4 puts them in a new place, I specifying the environment, to keep the cognitive dissidence low. I don’t strictly have to do this, but I think it’s good.

Note that we don’t want our third party modules to be in the same places as our specific modules: if installed them in the same place then we’d have to deal with these extra files in our source tree.

You see, when Vagrant starts up it creates a folder: /tmp/vagrant-puppet/ – it’s a shared folder so anything extra put in there shows up in our source directory.

So puppet module install installs third party modules in one place, and Vagrant installs our modules in another place.

Here’s where environments come in:

  1. We set our environment_path in the Vagrantfile to be ./. This is where Puppet will go looking for environments to load
  2. We set our environment – aka the environment Vagrant will tell Puppet to use – to a folder named puppet in the source directory. (remember that?)
  3. Puppet environments can contain three things: a config file, a modules folder and a manifests folder

Our config file sets the path for modules: we tell Puppet to look in our puppet/modules file, then look in the directory where puppet module install downloads its modules to, then look at the base module directory.

We need our config file because by default Puppet will look for modules in our environment’s module path, and the base module directory… and not where puppet module install puts things. (or so it seems…)

So that’s how you mis-use environments to get manifestdir “working” again.

May 24, 2015

Rapid System Level Development in Groovy

Filed under: ResearchAndDevelopment — Ryan Wilcox @ 2:53 pm

Introduction: Setting the stage

Lately I’ve found myself turning to Groovy for — oddly enough — system level development tasks. This is an unexpected turn of events, and seemmingly mad choice of technologies to say the very least.

Why Groovy?:

  1. I can’t assume OS (so Unix command line tools are out). One of my recent tasks involved something pretty complex, so shell script was out anyway.
  2. I can’t assume Ruby or Python is installed on any of these machines, but I can assume the JVM is installed.
  3. Groovy is a not bad high level language that I’ve also been using for other larger (non system level) programs.
  4. Since I’m on the JVM I can make executable jars, bundling up Groovy and all other dependancies into an easy to run program.

That last point is the real kicker. I want these programs to be easy to run. Even with that, as much as three days ago I wouldn’t have imagined doing programming like this in Groovy.

But this article isn’t about my poor choices in system languages: it’s about a workflow for small Groovy tools, from inception to ending up with an executable jar.

“But, but, what about?”

But, but, what about Go?“. I hear you, and I almost wrote my scripts in Go. Especially with the new easy cross-compilation stuff coming in Go 1.5. I expect to write tools like this in Go in the latter half of 2015 (or: whenever Go 1.5 is released + probably a couple months). I don’t have the patience to learn how cross compilation works today (Go 1.4).

But, but what… executable jars with Groovy?! Aren’t your jars huge?” Yeah, about 6MB a pop. I’ll admit this feels pretty outrageous for like a 50 line Groovy program. I’m also typing this on a machine with 3/4rds of a TB of storage… so 6MB is not a dealbreaker for me at current scale. But it does make me sad.

But, but what about JVM startup costs?” Yup, this is somwhat of a problem, even in my situation. Especially when in rapid development mode. This is another place where I almost wish I was writing in Go (cheap startup and compile times).

But this article is about rapid development in Groovy: going from an idea to releasing an embedable jar – maybe for systems programming, maybe for other things.

Fast, initial development of small Groovy scripts

As a newcomer to the Groovy scene I’ve Googled for this information, and found a couple of disjointed (and in some cases bitrotted) pieces on how to do these things (primarily packaging executable jars created from Groovy source). I hope another person (newcomer or otherwise) finds it useful.

Create your maven structure (which we’ll promptly ignore)

$ mvn archetype:generate -DarchetypeArtifactId=maven-archetype-quickstart -DinteractiveMode=false -DgroupId=com.wilcoxd -DartifactId=RapidDev (with values appropriate for your groupId and artifactId)

Dig into the generate src/main/java/com/wilcoxd and:

  1. make a new Groovy file
  2. Open the new Groovy file in your editor
  3. Add package com.wilcoxd as the first line of the file. Substitute com.wilcoxd with the groupId you specified in the mvn archetype:generate command.

While semantically, you should rename your java folder to groovy, that doesn’t seem to work with packaging process to create the executable jar. Just leave it be (I guess).

Rapidly develop your Groovy project (with two tricks)

The nice thing about Groovy is that you can write your Groovy program just like you would expect to write a Ruby or Python or Javascript: just type code into a file and Groovy will Figure It Out(TM).

Trick One: develop running your script directly

  1. cd into /src/main/groovy/com/wilcoxd/
  2. Write your script in the body of your .groovy file.
  3. Occasionally pop out to the command line and run groovy RapidDev.groovy (or whatever your script is called)

Groovy does a fair bit of work to execute your (even unstructured!) code. There’s some magic here that I don’t fully understand, but whatever.

$ vi RapidDev.groovy
.... type type type...

$ cat RapidDev.groovy
package com.wilcoxd

println "hey world"

$ groovy RapidDev.groovy
hey world

Crazy talk! No class needed, I don’t even need a public static void main style function!

Trick Two: dependency “management” with @Grab

If you find yourself needing a third party module, use @Grab to get it.

We’ll set things up with Maven, properly, later. Right now we’re concentrating on getting our program working, and turns out we need to make a RESTful API request (or whatever). We just need a third party module.

$ cat RapidDev.groovy
package com.wilcoxd

@Grab(group='com.github.groovy-wslite', module='groovy-wslite', version='1.1.0')
import wslite.rest.*

println("hello world!!!")

@Grab pulls in your dependancies even without Maven. I don’t want to introduce Maven here, because then I have to build and run via Maven (I guess?? Again, newbie in JVM land here…). Magic-ing them in with @Grab is probably good enough.

I’m sure Grab is not sustainable for long term programs. In fact, this isn’t a long term proposition: in fact we’re going to remove comment out grab the second we get this script done.

... iterate: type type, pop out and run, type type type...

$ groovy RapidDev.groovy

... IT WORKED! ...

We’re done coding! Now time to Set up pom.xml!

Yay, we’re done. Our rapid, iterative development cycle let us quickly explore a concept and get a couple dozen or a couple hundred lines of maybe unstructured code out. Whatever, we built a thing! Development in the small is nicer, sometimes, than development in the large: different rules apply.

But now we need to set up pom.xml, so it builds a jar for us.

Specify your main class property

Add this as a child of the <project> in your pom.xml:


Adjust the value of start-class as appropriate for your class / artifact ID from the mvn artifact:generate part of this.

Add Groovy and other third party modules you @Grabed into your <dependancies> section

Something like this (with a translated dependency, @Grab syntax to Maven syntax for the wslite module we previously grabbed above), to the <dependencies> section:


Once you have this in your pom, comment out the @Grab declaration in your source file

Add build plugin dependancies (another child of <project>):




            <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">

Know your Groovy code will be groovy, unstructured and all

As mentioned before, Groovy goes to some magic to implicitly wrap a class around unstructured code. In fact, it will use the name of the file as the name of the class (so name your files like you would name Java classes!).

In our example, we’ve been editing RapidDev.groovy, which Groovy will wrap up in a class RapidDev declaration… or something. That package com.wilcoxd means Groovy will actually wrap our unstructured code into a class com.wilcoxd.RapidDev… which is a fine name and what we specified in our pom’s start-class property.


With a simple mvn package we can bundle our Groovy script up to an executable jar. A java -jar target/RapidDev-1.0-SNAPSHOT.jar runs it.

Which is awesome! I can take this and run it on any system with the JVM! I can write my “complex” systems level program once and run anywhere! I can reach deep into the Java ecosystem for spare parts to make my development easier, and still have a rapid development cycle one expects out of Python or Ruby.

Pretty neat!

March 2, 2015

A Deep Dive Into Vagrant, Puppet, and Hiera

Filed under: ResearchAndDevelopment — Ryan Wilcox @ 1:08 am

A Vagrant setup that supports my blog entry on “Vagrant / Puppet and Hiera Deep Dive”. Below is a reproduction of that article.


This weekend I spent far more time than I’d like diving deep into Puppet, Hiera and Vagrant.

Puppet is a configration/automation tool for installing and setting up machines. I prefer Puppet to other competitors (such as Chef) for Reasons, even though I also use Chef.

Hiera is an interesting tool of Puppet (with no equivalent I’ve found in Chef, at least that I’ve found): instead of setting variables in your configuration source, do it in YAML (or JSON or MYSL or…) files. This ideally keeps your Puppet manifests (your configuration sourcecode) more sharable and easier to manage. (Ever had a situation in general programming where you need to pass a variable into a function because it’s passed to another function three function calls down the stacktrace? Hiera also avoids that.)

However, documentation on Puppet, Hiera is pretty scarce – especially when used with Vagrant, which is how I like to use Puppet.

This article assumes you’re familiar with Vagrant.

My Vagrant use cases

I use (or have used) Vagrant for two things:

  1. To create local development VMs with exactly the tools I need for a project. (Sample)
  2. To create client serving infrastructure (mostly early stage stuff).

For use case #2, usually this is a client with a new presence just getting their online site ramped up. So I’m provisioning only a couple of boxes this way: I know this wouldn’t work for more than a couple dozen instance, but by then they’re serving serious traffic.

My goal is to use Vagrant and 99% the same Puppet code to do both tasks, even though these are two very different use cases.

Thanks to Vagrant’s Multi VM support I can actually have these two VMs controlled in the same Vagrantfile

First, general Vagrant Puppet Setup Tricks

File Organization

I set my Vagrantfile’s puppet block to look like this:

config.vm.provision "puppet" do |puppet|
  puppet.manifests_path = "puppet/manifests"
  puppet.manifest_file  = "site.pp"

  puppet.module_path   = "puppet/modules"

Note how my manifests and modules folder are in a puppet folder. Our directory structure now looks like:


Why? Vagrant, for me, is a tool that ties a bunch of other tools together: uniting virtual machine running with various provisioning tools locally and remotely. Plus the fact that the Vagrantfile is just Ruby means that I’m often pulling values out into a vagrantfile_config pattern, or writing tools or something. Thus, the more organization I can have at the top level the better.

Modules vs Manifests

I tend to one module per project I’m trying to deploy. By that I mean if I’m deploying a Rails bookstore app, I’ll create a bookstore module. This module will contain all the manifests I need to get the bookstore up and running: manifests to configure mysql, Rails, redis, what-have-you.

Sometimes these individual manifests are simple (and honestly probably could be replaced with clever hiera configs, once I dig into that more), and sometimes a step means configuring two or three things. (a “configure mysql” step yes, needs to use an open source module to install MySQL, but also may need to create a mysql user, create a folder with the correct permissions for the database files, set up a cron job to backup the database, etc)

I also assume I’ll be git subtree-ing a number of community modules directly into my codebase.

My puppet/manifests/ folder than ends up looking like a poor man’s Roles and Profiles setup. I take some liberties, but it’s likely the author is dealing with waaaaay more Puppet nodes than I’d ever imagine with this setup.

Pulling in third party Puppet modules

The third party Puppet community has already created infrastructure pieces I can use and customize, and has created a package manager to make installation easy. Except we need to run these package managers before we run Puppet on the instance!

Vagrant to the rescue! We can run multiple provisioning tasks (per instance!) in a Vagrantfile!

Before the config.vm.provision "puppet" line, we tell puppet to install modules we’ll need later:

    config.vm.provision :shell, :inline => "test -d /etc/puppet/modules/rvm || puppet module install maestrodev/rvm"

Because the shell provisioner will always run, we want to test that a Puppet module is not installed before we try to install it.

There are other ways to manage Puppet modules, but this simple shell inline command works for me. I’ll often install 4 or 5 third party modules this way, simply copy/pasting and changing the directory path and module name. As long as I’m before the puppet configuration block these modules will be installed before that happens.

Uninstalling Old Puppet Versions (and installing the latest)

This weekend I discovered a Ubuntu 12 LTS box with a very old version of Puppet on it (2.7). I have a love/hate relationship with Ubuntu LTS: The LTS means Long Term Support, so nothing major changes over the course of maybe 5 years. Great for server stability. However, that also means that preinstalled software that I depend on may be super old… and I may want / need the new version.

I ended up writing the following bash script:

#!/usr/bin/env bash
# This removes ancient Puppet versions on the VM - if there IS any ancient
# version on it - so we can install the latest.
# It is meant to be run as part of a provisioning run by Vagrant
# so it must ONLY delete old versions (not current versions other stages have installed)
# It assumes that we're targeting Puppet 3.7 (modern as of Feb 2015...)

INSTALLED_PUPPET_VERSION=$(apt-cache policy puppet | grep "Installed: " | cut -d ":" -f 2 | xargs)
echo "Currently installed version: $INSTALLED_PUPPET_VERSION"

if [[ $INSTALLED_PUPPET_VERSION != 3.7* ]] ; then
  apt-get remove -y puppet=$INSTALLED_PUPPET_VERSION puppet-common=$INSTALLED_PUPPET_VERSION
  echo "Removed old Puppet version: $INSTALLED_PUPPET_VERSION"

It assumes your desired Puppet version is 3.7.x, which should be good until Puppet 4.

I also have a script that installs Puppet if it’s not there (maybe it’s not there on the box/instance, OR our script above removed it). I got it from the makers of Vagrant themselves: puppet-bootstrap.

Again, added before the config.vm.provision :puppet bits:

config.vm.provision :shell, path: "vagrant_tools/remove_puppet_unless_modern.sh"  # in case the VM has old crap installed...
config.vm.provision :shell, path: "vagrant_tools/install_puppet_on_ubuntu.sh"

Notice that both these shell scripts I store in a vagrant_tools directory, in the same folder as my Vagrantfile. My directory structure now looks like:


Puppet + Hiera

Using Hiera and Vagrant is slightly awkward, especially since many of the Hiera conventions are meant to support dozens or hundreds of nodes… but we’re using Vagrant, so we may have one – or maybe more, but in the grand scheme of things the limit is pretty low. Low enough where Hiera gets in the way.


The way I figured out how to do this is create a hiera folder in our puppet folder. My directory structure now looks like this:


A reminder at this point: the VM (and thus Puppet) have their own file systems disassociated with the file system on your host machine. Vagrant automates the creation of specified shared folders: opening a directory portal back to the host machine.

Implicitly Vagrant creates a shared folder for manifest_path and module_path folders. (In fact, these can be arrays of paths to share, not just single files!!!)

Anyway, our hiera folder must be shared manually.

Note here that Vagrant throws a curveball our way and introduces a bit of arbitraryness to where it creates the manifest and module folders. You’re going to have to watch the vagrant up console spew to see where this is: with the vagrant_hiera_deep_dive VM the output was as follewed:

==> default: Mounting shared folders...
    default: /vagrant => /Users/rwilcox/Development/GitBased/vagrant_hiera_deep_dive
    default: /tmp/vagrant-puppet-3/manifests => /Users/rwilcox/Development/GitBased/vagrant_hiera_deep_dive/puppet/manifests
    default: /tmp/vagrant-puppet-3/modules-0 => /Users/rwilcox/Development/GitBased/vagrant_hiera_deep_dive/puppet/modules 

Notice the /tmp/vagrant-puppet-3/? That’s your curveball: it may be different for different VM names (but is consistant: it’ll never change)

So, create the shared folder in the Vagrantfile:

config.vm.synced_folder("puppet/hiera", "/tmp/vagrant-puppet-3/hiera")

Likewise, we’ll want to add the following lines to the puppet block

puppet.hiera_config_path = "puppet/hiera/node_site_config.yaml"
puppet.working_directory = "/tmp/vagrant-puppet-3/"

Important notes about the hiera config

It’s important that Hiera only likes .yaml extensions, not .yml.

It’s also important that yes, having both the node_site_data.yml and node_site_config.yml files do feel a bit silly, especially at our current scale of one machine. Sadly this is not something we can fight and win, but a limitation of the system. Hiera’s documentation goes more into config vs data files.

But also note that the node_site_config file points to node_site_data, via Hiera’s config file format.


I’ve been using Vagrant and Puppet at a very basic level a very long time (something like 5 years, I think). From best practices I’ve been using for years, to new things I’ve just pieced together today, I hope this was helpful to someone.

Explore this article more by looking at the Vagrant setup on Github

July 21, 2014

Using the CSV NPM Module

Filed under: ResearchAndDevelopment — Ryan Wilcox @ 2:39 pm

Today I had to use the CSV Node Module. It looked like the best and most mature of the alternatives.

The disadvantage about it is that the examples – especially for taking Javascript objects and getting a CSV string out – really leave something to be desired.

To combat this I wrote a simple Node program to illustrate how to write CSV files with the module. Enjoy!

CSV 0.4.0 example

For an example that uses the old – CSV 0.2.0 syntax – see below. Yes, I only provide an example of the callback syntax. You should really upgrade to CSV 0.4.0.

CSV 0.2.0 example


February 10, 2014

My Base Rails Setup, 2013 Edition

Filed under: ResearchAndDevelopment — Ryan Wilcox @ 10:32 am

In 2010 ago I wrote My Base Rails Setup.

I looked at it again and it looks pretty dated. The Rails community changes a lot in just 3 years.

Here are the tools I run into with frequency on Rails projects, and some notes where that has changed from the 2010 tools of choice.

Over the last 3 years I’ve done full time, then part time, consulting, mostly in Rails. The advantage here is that I see a lot of Rails apps written by other people. Occasionally I also get to start a Rails app from scratch.

So, my choices tend to be relatively conservative: tools that are either so common that I pick them because “everyone” uses them, or tools that introducing some potentially odd tool to the Rails app is negated by the clear win it has over the “competition” (be that existing gems or roll-your-own.

I’m very thankful that I’ve gotten to see a lot of Rails apps, of all sizes, in my nearly 5 years doing Ruby on Rails.

Reviewing Old Picks

Looking back at my tools from 2010:

  • will_paginate: I mostly see and use Katamari these days
  • formtastic: I’m still a big Formtastic fan, but I mostly am brought into already existing projects, and the court of public opinion has chosen simple_form.
  • DataMapper: in the dozens and dozens of Rails projects I’ve been on I’ve seen DataMapper once.
  • sentient_user There’s still no substitute, and I’m surprised this doesn’t get more love than it does.
  • show_for Not used enough in my opinion.
  • annotate_models: The community “solution” to “what columns does this table have in it?” appears to be, “open up schema.rb in another window”. So that’s what I do.
  • inherited_resources: Responders in Rails 3 cut down some of the cruft that I would use inherited resources for. I always assume Responders will be a good 80% solution and keep my eyes out for when to refactor the action into old style respond\_to blocks.
  • Data Migrations: There’s still no substitute – db/seeds.rb sucks for serious data migration tasks. There is a gem that provides similar abilities to the Rails plugin I linked to a few years ago: active_data_migrations. Sadly another place where worse (seeds.rb) seems to be better.
  • shoulda: RSpec won the war.
  • Timecop Still the best.
  • Macinist Factory Girl won. Thankfully FactoryGirl has gotten better syntax over the years (or I’ve gotten accustomed to it).
  • Mailtrap still maybe the best. An interesting up and comer is Mailcatcher, but I’m not 100% sold.
  • rr: Rspec’s mocking framework seems to have won the war here too.
  • jQuery’s timeago plugin Still the best.
  • BlueprintCSS: Now-a-days the winner is Twitter Bootstrap. I like the bootstrap-sass gem with the twitter-bootstrap-markup-rails gem abstracting away common constructs (alerts, tabs, etc).
  • utility_belt: All Hail Pry

So, we have better tools for a lot of those. The new picks are certainly on my “base rails tools” list.

New Base Gems/tools

Here are the new gems/tools I’ve either seen in common use, or always apply when I come into projects:

  • foreman every Rails app pulls in at least 1, if not 3 or 4, additional services. Redis, memcached, or solr are the usual suspects here, but it varies. Foreman lets me launch all those services with a simple command, stopping them all when I need to.
  • pry is the top debugger for Ruby. Especially when coupled with the pry-stack_explorer to explore the callstack and plymouth to open up pry when a test fails.
  • A Rails VM setup with Puppet. Every project I’m on I use this to create a VM and set up packages required for the system to operate.
  • cancan. While it needs some love, cancan is (sadly) still the authorization solution solution most used on Rails projects I see. I wrote a blog entry on organizing complex abilities. Cancan goes firmly in the “solutions in common use in apps I see”, not in the “tools I like using”.
  • rack-pjax. Useful if you’re using PJAX. If you need to optimize for speed you could write a method that sets or unsets the layout depending on if the request is PJAX or not… but rack-pjax is “drop in and you’re done”. I’m pretty sold on using PJAX for returning partial page content
  • rabl define your JSON/BSON/XML structure in a view file. jbuilder is the default in Rails 4, doing the same kind of thing, but I haven’t used it.

Right time, right place tools

Sometimes tools aren’t everyday tools. Sometimes the perfect tool, used in just the right place, is a godsend. Here is a list of tools like that, tools I’ll apply when the opportunity presents itself, although it only occasionally does:

  • pessimize writes gem versions to your Gemfile, so you can be very liberal with gem versions in development but then very conservative when time comes for production.
  • versioncake If I’m intentionally writing an API I want to version that API. Version Cake loads up the proper RABL or Jbuilder file for the specified version. (Yes, yes, see also Designing Hypermedia APIs).
  • SASS variables and the asset pipeline an approach when we need SASS global variables, working around limitations of the asset pipeline.
  • my ‘s’ helper. If I know I’m going to be on a project for a long while, and I join HTML sets together twice, I’ll bring this snippet in.


I’m excited to see what the next 3 years bring in Rails tools changes, and to see how this list stacks up then!

December 28, 2013

Rails Project Setup Best Practices

Filed under: ResearchAndDevelopment — Ryan Wilcox @ 1:26 am

As a long time Rails consultant, every new Rails project I come on to I go through the same dance:

  1. Does this project have a useful README.markdown? Maybe with setup instructions?
  2. No? Just Rail’s default README.markdown? Shucks.
  3. Does this project have a database.yml file?
  4. No? Great. Does this project have a sample database.yml file I can copy and get sane defaults for this project?
  5. Does this file have a .ruby-gemset file?
  6. Yes? Great. Does this .ruby-gemset have sane defaults, or is the gemset named project or something non-obvious?
  7. Is there a redis or solr config file I need to edit?
  8. Do I need to set up machine specific variables or settings anywhere? (For example, in .env, config/settings.yml, or config/secrets.yml, or even just in environments/development.rb?).
  9. No? Ok, great, does the app assume I’m running it on OS X with packages installed via Homebrew? (Because I’m usually not. And if I am running your project on bare metal, I prefer Macports.)
  10. Is there a Procfile?
  11. Yes? Great. Does that Procfile work for development, or is it a production-only Procfile?
  12. No Procfile? What services do I need to run to use the app? Redis? Solr? Some kind of worker queue mechanism?
  13. How do I run all the tests?
  14. rake db:setup
  15. rake spec
  16. Did something fail because I missed a config setting or some service isn’t running? If true, fix and GOTO 15.
  17. Awesome, it worked.
  18. Are there Cucumber or Selenium tests?
  19. Run those, somehow.
  20. Fire up Rails server
  21. When I visit the development version of the site, is there a special account I log in as? Or do I register a user account then browse through the code figuring out how to make myself an admin or registered user or what, then do that via rails console?

You could split these questions up into configuration questions and runtime questions. This blog entry will show best practices I try to install on (almost) every Rails project I touch.


Runtime is the easiest, so I’ll tackle it first.

In my mind this is mostly solved by Foreman and a good Procfile, or set of Procfiles.

Setup with Procfiles and Foreman

A Procfile will launch all the services your app needs to run. Maybe you need Solr, Redis, the Rails server, and MongoDB up: you can set up a Procfile to launch and quit those services all together when you’re done.

Heroku uses Procfiles to make sure everything’s up for your app. Heroku’s usually my default, “pre-launch to mid-traction point” hosting choice because of its easy scaling and 2 minute setup process.

Heroku also provides tons of addons, adding features to your app. Sometimes these features are bug reporting or analytics, and sometimes the Heroku addons provide additional services. Two addons that do this are Redis 2 Go, and ElasticSearch from Bonsai.io.

If an app uses Redis, is deployed to Heroku, and uses the Redis 2 Go addon, then the app doesn’t need to have Redis in its Procfile.

However, when I’m developing the app I need these services running locally.

Foreman takes care of this, reading a Procfile (a default Procfile or one I specify) and firing up all of the services just like Heroku does. Don’t Repeat Yourself in action.

When I’m setting up a project that’s getting deployed to Heroku I create two Procfiles: one Procfile and one Procfile.development.sample. (I add Procfile.development to the .gitignore file in Git).

The Procfile.development.sample is important for two reasons:

  1. It lists all the services I’ll need to be running as a developer
  2. It can be used as is, or if a developer has say Mongo already running via a startup script, but the Procfile.development.sample tries to launch it again, they can copy the file, rename it to Procfile.development, and remove the line about starting up Mongo.

When I’m not deploying to Heroku I’ll still create a Procfile.development.sample for ease of starting up servers.

Running all the tests

Testing is big in the Rails world, and there’s a lot of ways to test Rails apps. RSpec with maybe Cucumber is usually what I see, but sometimes there’s a Selenium suite or Steak or something.

When I’m setting up a Rails project I write a quick Rake task to run all the test suites. For RSpec + Cucumber it looks something like this:

namespace :test do

  desc "Run both RSpec test and Cucumber tests"
  task "all" => ["spec", "cucumber"]

As a developer on the project – especially a new developer – I just want to type in one command and know I’ve tested all the app there is to test.


When I’m setting up a project I create sample files for each configuration file that might be modified by a developer. So, files with names like:

  • config/database.sample.yml
  • ruby-gemset.sample
  • config/redis.sample.yml
  • .env.sample
  • config/secrets.sample.yml

But this still doesn’t solve our song and dance from the beginning of the blog entry: there’s still a lot to configure, even if I have sample files to copy and rename!

Like any good geek, I’ve replaced this frustration with a small shell script (template). Each project is different, and so each bin/install.sh will look a little different, but here’s a recent one I made for a non-trivial project:


# If you want to go fancier, see some prompts in
# <http://stackoverflow.com/questions/226703/how-do-i-prompt-for-input-in-a-linux-shell-script>

if [ ! -e Procfile.development  ]
    cp Procfile.development.sample Procfile.development

    echo "Do you wish to edit Procfile.development?"
    select yn in "Yes" "No"; do
    case $yn in
        Yes ) $EDITOR Procfile.development; break;;
        No ) break;;

if [ ! -e config/database.yml  ]
    cp config/database.yml.example config/database.yml
    echo "See the default database.yml?"
    select yn in "Yes" "No"; do
    case $yn in
        Yes ) cat config/database.yml.example; break;;
        No ) break;;

    echo "Do you wish to edit this database.yml?"
    select yn in "Yes" "No"; do
    case $yn in
        Yes ) $EDITOR config/database.yml; break;;
        No ) break;;

if [ ! -e config/redis.yml  ]
    cp config/redis.yml.example config/redis.yml
    echo "Do you wish to edit redis.yml?"
    select yn in "Yes" "No"; do
    case $yn in
        Yes ) $EDITOR config/redis.yml; break;;
        No ) break;;

if [ ! -e .ruby-gemset ]
    echo "Do you wish to create a .ruby-gemset file and edit it?"
    select yn in "Yes" "No"; do
    case $yn in
        Yes ) cp .ruby-gemset.copy .ruby-gemset; $EDITOR .ruby-gemset; break;;
        No ) break;;

if [ ! -e .env ]
    cp .env.sample .env
    echo "Do you wish to edit .env?"
    select yn in "Yes" "No"; do
    case $yn in
        Yes ) $EDITOR .env; break;;
        No ) break;;

It’s not the prettiest example of a shell script ever, but it’s easy and fast to modify and should run in all shells (I avoided fancy zsh tricks, even though zsh is my primary shell).

Run this and it will guide you through all the files you need to copy, asking you if you want to edit the config file when it’s in place. For opinionated files, like .ruby-gemset, the script will ask what you want to do.

Each of my sample files contain sane default values, which should work for the developer, but they don’t have to.

Thoughtbot has some initial thoughts on project setup too (they call it bin/setup), but they take a slightly different approach (and automatically set up different things). You could use there shell script along with mine if you wished.

My Ultimate New-To-This-Project Developer Experience

Since we’re talking about developer automation and project setup, I’d like to share my own dream experience:

  1. checkout code from Git repo
  2. “Oh, look, a Vagrantfile”
  3. $ vagrant up
  4. (15 minute coffee break while Vagrant boots up box and provisions it all for me, including Ruby setup)
  5. (During 15 minute coffee break, glance through the project’s README.markdown, see mention of running bin/install.sh)
  6. $ vagrant ssh
  7. $ cd $PROJECT
  8. $ bin/install.sh
  9. (Answers questions in install.sh and gets settings tweaked for this VM)
  10. $ rake db:setup
  11. $ foreman start -f Procfile.development
  12. $ rake test:all in a new tab. All tests pass.

Low barriers to entry, very automated project setup – help me get it set up right the first time. Help me be more productive faster.

You notice I called rake db:setup which creates a new database, loads the schema, and loads db/seeds.rb. Replace this step with “run migrations from 0” and “load in initial bootstrap data” if you wish. I’m usually in the “migrate from 0” camp, but usually find myself in a minority.

Anyway, If you compare the top list with this list you’ll see that the steps followed are very different. The first set of steps is hesitant: does this thing exist? Do I need to do X? The second set of setups is confident: The machine set this up for me and so hopefully everything is right.

In Summary

Here’s the best practices to take away from this blog entry:

  1. Consider creating a Vagrant setup for your project, including provisioning.
  2. Documentation in the README.markdown with basic “how to setup this project” instructions.
  3. Sample config files with values that are opinionated, but since they’re copied into place, easily changable.
  4. A bin/install.sh script like mine, or bin/setup script, like Thoughtbot’s.
  5. A Procfile just for developers
  6. A way to run all the tests, easily
  7. Load just enough sample data on a developer’s machine to allow them to get to major sections of your app without having to learn how to use it on day 1.

The easier it is for a developer to get up to speed on your project, the faster they can start getting real work done!

October 28, 2012

Modern Cocoa Concurrency / Asynchronous Processing Patterns

Filed under: ResearchAndDevelopment — Ryan Wilcox @ 9:35 pm


In the last several years Apple has been pushing us Cocoa developers away from thread based concurrency. Thread based concurrency tends to be bug-ridden, ceremony filled, and just unpleasant code.

There have also been some changes in the wider world of development that have changed how we as developers thing about concurrency.

Today there are six different patterns to choose from: NSOperationQueue, callbacks, futures, subscribers, GCD, and making your async APIs synchronous anyway.

This is meant to me a world-wind tour/survey of the various techniques Cocoa developers use to solve the thorny problems of asynchronicity and/or concurrency.


A really nice Cocoa class for doing The Right Thing when given N tasks to perform. I like it with blocks.

It’s a simple API which means it’s great for quick things, but it also lacks power. But sometimes you don’t need power!

You can also limit the number of operations running at a time. For example to limit the number of concurrent downloads.


Imagine some code that looks like this:

[myObject makeANetworkRequestWithCompletionHandler: ^(id result, NSError* err) {

  NSLog(@"I am done");

A lot of network APIs/Frameworks for Cocoa look this this (cough, coughFacebookcough, cough).

Callbacks are fine when you can decouple time from your program. “Whenever I get this data that’s fine, it’s just background data anyway”. Except imagine if it’s a list of documents stored on some server somewhere – parts of your UI might want that information to display data in a table or a count.

Yes, in some cases you can use interface feedback to tell the user to wait… Certainly a more responsive UI is better. But if you’re waiting for data from the network in order to parse and make another request on the network, you end up with callback/block soup.


Getting an object now that I can check and get my result “later” is appealing to me.

The most promising is DeferredKit – it returns you a defered object which you can easily add or chain callbacks too.

RESTKit uses the future pattern and allows you to assign callback blocks to properties on the object you are creating. These blocks are then called as part of the RESTKit object callback chain-of-events. Example:

record.onDidLoadResponse = ^(RKResponse* response) {
  // ...

Mike Ash also has a Friday Q&A about Futures

Reactive (Subscribers) model

Github has a ReactiveCocoa framework that does a bunch of crazy allowing you to subscribe to event completions and event changes.

I don’t yet completely understand the framework or mindset behind this. However, when my projects require a more comprehensive, and unified approach to async or concurrency – and I have a week to spare to properly grok this framework – I’ll be looking into ReactiveCocoa.

Grand Central Dispatch

Grand Central Dispatch is Apple’s approach at C level concurrency. There’s a lot in GCD, and I usually prefer using NSOperationQueue or something higher in the stack, but sometimes you really want a low level tool.

Like today. I needed to make 25 HTTP requests to a server, parse them, and let some other part of the app know when these requests had been parsed.

Grand Central Dispatch has two tools I tried:

  1. dispatch_async(dispatch_get_main_queue(), ^{...});
  2. dispatch_group_async
  3. dispatch_semaphore


When given a task (the code in a block) it will run it at some point in the future. Coupled with dispatch_get_main_queue you’ll be able to run things on the main loop at some point in the future.

Wait, why would you want to do that? Answer? cough, coughFacebookcough, cough.


You can create a group of dispatched tasks using dispatch_group_create(), dispatch_group_async(). You can even wait for all the tasks in the queue to complete with dispatch_group_wait().


Apple also provides a cheap, reference counting semaphore. dispatch_semaphore_create, dispatch_semaphore_signal and dispatch_semafore_wait are the important methods here.

You can even Construct your own dispatch_sync out of these primitives

dispatch_group_wait(), dispatch_semaphore and the current thread

the *_wait methods have one disadvantage: they are blocking. So, use case here is to do something like:

dispatch_group_t group = dispatch_group_create();

    dispatch_queue_create("MINE", DISPATCH_QUEUE_SERIAL),
    ^ {
      NSLog(@"hello world");

dispatch_group_wait(group, DISPATCH_TIME_FOREVER);

Notice that dispatch_group_async above creates a new dispatch queue. This means that “hello world” will be printed out in some other thread. The current thread will halt, waiting for your dispatch to be complete, then it will print out “done”.

Now, this behavior is bad if you’re already on the context you are dispatching to. For example, imagine a function that is called on the main thread, like a button outlet.

- (IBOutlet) handleButton: (id) sender {
  dispatch_group_t group = dispatch_group_create();

  dispatch_group_async(group, dispatch_get_main_queue(), ^{

    NSLog(@"Hello world");

  dispatch_group_wait(group, DISPATCH_TIME_FOREVER);

This will actually cause a deadlock – you’ve told the main thread to sit there, waiting, until something finishes, but you’ve also scheduled something to happen on the main thread. Nothing will happen.

Apple’s documentation isn’t exactly explicit about this, and I had to prove this to myself with some experimentation and some stack overflow reading. However, all is not lost.

Synchronous APIs in an Async World

My main reason for doing this is interacting with Facebook. Facebook’s network APIs are cranky when they’re not on the main thread.

I also really want to avoid the ceremony that comes with asynchronous callback handling. Setting up something to process the data whenever it shows up, some kind of wait indicator for the users, and potentially confused users when they look at an unexpectedly blank table (because we haven’t gotten the data yet).

So, for very limited but very “save myself a lot of typing and re-architecting of this whole chunk of code” situations I actually want to stop the flow of code at the callsite. I don’t want to block a thread waiting for results (especially since I may need to execute GCD dispatches on this thread)… but I do want to defer execution of the next lines of code until the dispatch is complete.

I ended up implementing ASYNC_WAIT_ON_SAME_THREAD for exactly this purpose.


In order to wait for an operation to complete we’ll want to avoid using GDC semaphores or waiting mechanisms. I eventually came up with this implementation of ASYNC_WAIT_ON_SAME_THREAD

void ASYNC_WAIT_SAME_THREAD(dispatch_queue_t queue, dispatch_block_t users_block) {

    __block BOOL allDone = NO;

    dispatch_async(queue, ^{

        allDone = YES;

    while( !allDone ){
        // Allow the run loop to do some processing of the stream
        [[NSRunLoop currentRunLoop] runUntilDate:[NSDate dateWithTimeIntervalSinceNow:0.5]];

Yes, it polls for a result. It’s inelegant, but the best solution I could come up with given the constraints.


Although things are happening in the main thread (your dispatches are running) the UI might feel very unresponsive to the user if you have many small dispatches lined up.

In fact, if you have an operation that takes 1 second, the user will see that lag. They press a button, your code uses ASYNC_WAIT and the app just sits there for 1 second. So ASYNC_WAIT is not a panacea, but in certain cases the code advantages outway the UI disadvantages.

Consider use of ASYNC_WAIT a code smell. It’s not that use of it is bad, but it represents a potentially dangerous area of code.

Then again, code smells are sometimes OK. IN the C++ world we have reinterpret_cast which tells the compiler, “Trust me, I know that this bit of data is actually this thing”. Use of a few reinterpret_casts to get yourself out of hairy situations is fine, but not every nail should be screwed with that particular socket wrench.


I hope this survey of the concurrency / asynchronous design patterns in Cocoa has been useful. I look forward to more and more work being done in this area by the community.

Next Page »