Best Practices Patterns

Patterns: don’t mess up the prod db!

With 12factor style environment configs, it’s a very easy to accidentally connect to your production database when you think you’re connecting to dev. Here’s a simple guard you can add to make sure your  DATABASE_URL isn’t somehow pointed to someplace it’s not supposed to (assuming you’re using Amazon AWS):


if [[ ${DATABASE_URL} == *"amazonaws"* ]]; then exit -1; fi


if 'amazonaws' in os.environ['DATABASE_URL']:
   exit('Cannot be run against a production database')


if 'amazonaws' in settings.DATABASES['default']['HOST']:
    raise CommandError('Cannot be run against a production database')

(thanks to x110dc for the bash syntax and 12factor link)

You didn't say the magic word

Mental Note: Add Category

TBT: Doom mooD

doom mood

I posted about this a long time ago, but I was cleaning my hard drive and found this screenshot and I need to brag about how awesome Doom mooD was:

  1. The face animated
  2. The face got bloodier the higher the CPU usage got
  3. The expression changed depending on how quickly CPU usage changed
  4. Doomguy’s face turns to look at your mouse pointer when it enters the widget
  5. A testament to the original id design, this little bar packs a ton of information
  6. iddqd
Demo of the face animation of Doom mooD

2006 me was cleverer than I remember. I was mindful of overhead, made my own crude bitmap font, and was doing some pretty clever JavaScript for 2006.

I even found my original README:

This widget was born from boredom. I saw a Doom screenshot and thought
to myself that the status bar of Doom would make an entertaining
system monitor, and so I made it in Konfabulator.

I never though it would be all that useful, but I must say, I surprise
even myself. Turns out I use this widget ALL the time. And watching
the face is quite entertaining. So entertaining in face I made a
special face-only mode for the widget. The face mode is excellent
because you can put the face anywhere and he blends in. And a quick
glance tells you how your system is doing. (HINT: blood = bad).

I must warn you, one side effect of using this widget is that I
started to stress my system on purpose, just to get the guy to make
cooler, bloodier, faces.

== Usage Tips ==
To the right of the memory indicator, there are three indicators.
The first one shows if you have a CD or DVD inserted. Clicking will
open it.
The second one will show you if your volume is muted.
The third one will give info about the Trash/Recycle Bin. Clicking
will open it.

== Weapon Slots ==
These are the indicators labelled '2' through '7' directly left of
the face. In Doom mooD, they are used as quick launchers.

To add an entry, simply drag and drop what you want to launch onto
a number. The widget will remember this entry between sessions.

To clear an entry, right click and select 'Clear'

To launch an entry, right click and select 'Launch'
or while the widget is focused, type the number corresponding to entry.

The entry can be anything. A link, a program, a document, a picture, etc.

== Side Meters ==
On the far right, there are four meters. To set these, go into
preferences and check the \"Side Meters\" tab. The DISK meter will not
show anything until you set a drive for it to monitor. To do this,
right click on the DISK text and select the drive from the context
menu. If you do not see drives in the context menu reposition the
mouse and try again.

Note: The Virtual Memory meter isn't very accurate.

== How To Read the Face ==
Ok, so you know the more beat up and bloody he gets, the worse off
your system is. The blood corresponds to CPU load and memory load.
When he grimices in pain, your CPU load just went up. There are two
levels of this. And finally, when he's grinning, your CPU load just
went down.

It's fun to enable face-mode, and put the face in the middle of an
analog clock face, or interacting with something on your desktop, etc.
You get the idea.

== Performance Concerns ==
Since I use a lot of custom fonts, Konfabulator uses more memory
than you'd think. For each digit, K! loads a copy of the entire font
table. I might be able to reduce it further by breaking up the images
into separate files, but I'm satisfied with the status quo. Scaling
the graphics larger for the bar exponentially increases the number
of page faults generated.

The graphics are copyright 1993 ID Software, and used with permission.


1.4.1 (9.10.2006)
+ fixed zOrder of new custom hovertips
+ turned off debug mode
* updated icon

1.4 (7.23.2006)
+ can Empty Trash from widget
+ new custom hovertips show even when widget is not focused

1.3 (1.29.2006)
+ Updates split up to be more efficient, some meters can be disabled now.
  (fixes problems people had with high utilization when accessing the disk)
* Digit packaging changed, should very slightly reduce pagefaults and
  memory usage, but increases package size.
+ New weird face shows up less often.
- fixed bug where volume didn't change scale.
+ widget now takes a 5 second nap when computer wakes up

1.2 (1.5.2006)
+ New Option: Reverse CPU/Memory. When enabled, CPU/Memory count down from 100,
  akin to health going from 100 to 0
- fixed bug that kept some faces from appearing
+ Weapon Slots now active. See documentation above to see how to use them.
* minor optimizations
+ Tooltips added to side meters
* Changed free space calculation method to Round from Floor

1.1 (12.27.2006)
+ The four meters on the right are now determined by preferences
+ Added WiFi side meter option
+ Disk meter has to be set by the user now (right click on \"DISK\")
- Redid some code because of the 3.0 Engine 

- bugfix

- First Public Release

  reduce pagefaults
  battery usage statistics
Neato! Nerd

Check yo queries before you wreck yo site

When building high performance Django sites, keeping the number of queries down is essential. And just like controlling technical debt and maintaining test coverage, being successful means making monitoring queries a natural part of your workflow. If your momentum has to be stopped to examine database queries, you’re not going to do it. The solution for most developers is Django-Debug Toolbar, which sits off to the side in your web browser, but I’m going to share the way I do it.

I do like Django-Debug Toolbar, but most of the time, it just gets in my way: it only works on pages that return HTML, the overhead adds to the response time, and it modifies the response (adding to the response size and render time). What I prefer is displaying SQL queries along HTTP requests in runserver’s output:

Terminal Output with Colorizing Output and SQL

There’s three parts to getting this to work:

  1. ColorizingStreamHandler — this lets me distinguish http requests (gray/info) from SQL queries (blue/debug)
  2. ReadableSqlFilter — this reformats output by stripping the SELECT arguments so you can focus on the WHERE clauses
  3. Opt-in — having SQL spat out everywhere can be distracting, so it’s opt-in with an environment variable

Getting started

ColorizingStreamHandler and ReadbleSqlFilter are a logging handler and a logging filter packaged in project_runpy. Add it to your Django project with pip install project_runpy (no installed apps changes needed). They get thrown into your Django logging configuration like:

[python]LOGGING = {
    'version': 1,
    'disable_existing_loggers': False,
    'root': {
        'level': os.environ.get('LOGGING_LEVEL', 'WARNING'),
        'handlers': ['console'],
    'filters': {
        'require_debug_false': {
            '()': 'django.utils.log.RequireDebugFalse',
        'require_debug_true': {
            '()': 'django.utils.log.RequireDebugTrue',
        'readable_sql': {
            '()': 'project_runpy.ReadableSqlFilter',
    'handlers': {
        'console': {
            'level': 'DEBUG',
            'class': 'project_runpy.ColorizingStreamHandler',
    'loggers': {
        'django.db.backends': {
            'level': 'DEBUG' if env.get('SQL', False) else 'INFO',
            'handlers': ['console'],
            'filters': ['require_debug_true', 'readable_sql'],
            'propagate': False,
        'factory': {
            'level': 'ERROR',
            'propagate': False,


I found that the logging can get obtrusive. It’ll show up in your runserver, your shell, when you run scripts, and even in your iPython notebooks. So using a short environment variable makes it easy to flip on and off. I’m also using project_runpy’s  env environment variable getter, but if you don’t want that, I suggest using  if 'SQL' in os.environ to avoid how  '0' and  'False' evaluate to  True when reading environment variables.

I was afraid that adding ReadableSqlFilter would add a performance penalty, but I don’t notice any. The extra verbosity is also nice when you have a view with several expensive queries because you can see them run before the page is rendered.

How to use this

Just code as usual! If you’re like me, you keep at least part of your runserver terminal visible. If you are me, hello! Use your peripheral vision to look for long flashes of blue text. When you do, you know you’ve hit something that’s making too many queries. The main culprits will be missing select_relateds and prefetch_relateds. Eventually, you’ll develop a sixth sense for when your querysets could be better, and then you’ll look back at your old code like this:



A warning: don’t think too much about the query time reported. You can’t compare database queries done locally with what happens in production. One thing we did confirm is 10 one second queries are much slower than one 10 second query, even in the same AWS availability zone. Something that isn’t as obvious when everything is local. Just pay attention to the number of queries.

Finally, if you really want to keep your database hits low as you continue to develop, document them in your tests. Django makes an assertion available called  assertNumQueries that you can throw in your tests just to document how many queries an operation takes. They don’t even have to be good; for example, I wrote these scraper tests to document that my code currently makes too many database queries. It’s similar to making sure your views return 200s. Make sure you know how many queries you’re getting yourself into.

Too Many Queries

Finish Writing Me Plz Nerd

Demo: Fuckin A

I’ve been learning Ansible off and on and off for the past year. I have so many complaints, but I still think it’s the least worst provisioning system out there.

It’s online at, but I’ve also embedded it below:

I’d go more into Ansible, just just thinking about it make me so angry. So instead, I want to go over some of the new (JavaScript) things I learned working on this project.


I’ve converted some projects from Sass to LibSass before, but this is the first time I used LibSass from the start. Not having to mess with a Gemfile and Bundler is so freeing. It’s too bad there are still so many bugs in the libsass, node-sass, grunt-sass chain. They finally fixed sass2scss in the summer, I think Sass maps are still messed up. Despite this, I would definitely go with LibSass first.


This is the first project I’ve used `grunt-contrib-connect`. Prior, I would have just used `python -m SimpleHTTPServer`. What I like about using Connect is how it integrates with Grunt, and how I have better control over LiveReload without having to remember to enable my LiveReload browser extensions.


I’ve experimented with Browserify before with, where I had previously experimented with RequireJS and r.js. I was very happy with my experience with it for this project:

  1. It found my node modules automagically. I didn’t know it did that. I just tried it and it worked.
  2. It made writing JavaScript tests so much easier. I did something similar with text-mix, but that used Mocha, and I did not have a pleasant experience setting that up. Seriously, why are there so many Grunt/Mocha plugins? And why do so many of them just not work? This time, I used Nodeunit, which was available as a Grunt contrib plugin and a breeze to set up.

If you don’t need a DOM, writing simple assertion tests is definitely the way to go. If it’s faster to write tests, you’re more likely to write them. And best of all, since Browserify runs in an IIFE, it doesn’t put `module` or `define` into the global scope and mess up everything.


Testing Ansibile Playbooks with Vagrant

I’ve been interested in Ansible for a long time now, and thanks to my coworker’s expertise, I’ve been able to get my feet wet. My main barrier has been finding a way to run playbooks.

  • For my first attempt at learning Ansible, I tried running it locally using `ansible_connection=local`. But I found that running it on a system where changes persisted made it hard to trace what was going on. The final straw was trying to locate guides and resources about how to use Ansible was so difficult. You need documentation for how to read the documentation.
  • For my second attempt at learning Ansible, I tried running Vagrant in Ubuntu 14.04, but for some reason, it would not even install.
  • For my third attempt at learning Ansible, I tried running it against a Docker container, but the lack of ssh and pid 1 made re-creating the full experience too difficult. I found that I could get 80% of the way there using tricks from and
  • For my fourth attempt at learning Ansible, I tried getting Vagrant up and running on my Windows machine, and it worked! It also started working on my Ubuntu machine too.

Then I started following reading:, but modified it for my own purposes. I got this working in Windows first, but everything here works exactly the same in Linux and OSX too.

I made a few minor modifications I’ll go over now. I changed the base box to be similar to what we use in production and to one that wasn’t two years old. In my case, trusty64:

$ vagrant init ubuntu/trusty64

Then, I adjusted the Vagrantfile to use a shell provisioner. Here’s a mine (with comments removed):

Vagrant.configure(VAGRANTFILE_API_VERSION) do |config| = "ubuntu/trusty64"
  config.hostname = "ansible-test"
  config.vm.box_check_update = false "public_network"
  config.vm.provision :shell, :privileged => false, :path => ''

And this is contents of referenced above:

echo `whoami`
sudo apt-get update -qq
sudo apt-get install -y avahi-daemon

Using avahi/zeroconf gives me a hostname on my local network so I can use a predictable and human friendly hostname.

Let’s get started!

$ vagrant up

And now let’s run our playbook, jenkins.yml

$ ansible-playbook --private-key=~/.vagrant.d/insecure_private_key -u vagrant jenkins.yml

To get it to connect to ansible-test.local, I hacked my  /etc/ansible/hosts file according to these instructions:

So how do I feel about Ansible? It’s meh. I still hate it, but I hate it less than Chef and Puppet. I hate them all so much.

Let the hate flow through you


Best Practices Case Study

Optimizing Docker Image Size For Real

I’ve come across tips on how to keep Docker images small and Dockerfiles with strange lines that seem to exist only to optimize image size. Well, it turns out they’re all wrong.

They may have an effect with flat Docker images, but everything else (i.e. 99% of what people do), cleanup steps are just extra steps. When Docker builds an image from a Dockerfile, every step is a checkpoint, and every step is saved. If you add 100 MB in one step, then delete it the next, that 100 MB still needs to be saved so other Dockerfiles with the same step can reuse it.


REPOSITORY               TAG             IMAGE ID            CREATED             VIRTUAL SIZE
test/baseline            latest          7b590dec9b43        7 hours ago         272.6 MB
test/baseline_lines      latest          e165025980f7        9 minutes ago       272.6 MB
test/baseline_lists      latest          b40f9e108a93        About an hour ago   272.6 MB
test/combo               latest          744b502e0052        2 seconds ago       269.8 MB
test/combo2              latest          be8f1c1de02e        About an hour ago   249.8 MB
test/combo3              latest          da948e2838d9        About an hour ago   249.8 MB
test/install             latest          e7cadcbb5a05        12 hours ago        269.8 MB
test/install_clean       latest          dd1383285e85        12 hours ago        269.8 MB
test/install_lists       latest          e55f6f8ebac8        12 hours ago        269.8 MB
test/purge               latest          ef8c2aa7400b        About an hour ago   273.5 MB
test/remove              latest          75e3e5c4e246        About an hour ago   273.5 MB

Hypothesis: Docker’s base Ubuntu image does not need `apt-get clean`

I did an experiment around Docker 0.6. I think my conclusion was that `apt-get install … && apt-get clean` saved a few megabytes. But I head that you didn’t need to do that. If you compare the “test/install” and “test/install_clean” size, you’ll see there is no difference. So you don’t need `apt-get clean`.

Hypothesis: `rm -rf /var/lib/apt/lists/*` saves some space

I’ve been seeing a lot of Dockerfiles lately with this line. Including lots of official Docker images. If those guys are all doing it, surely it must have some effect. Nope.

Hypothesis: Combining similar lines saves space

There’s some overhead for each line in a Dockerfile. How significant is it? Well, it turns out it’s not. What I did find out though, is that it does save a significant amount of time and saves a lot of disk thrashing. So combining lines does not save space, but saves time.

Hypothesis: Combining multiple steps saves space

This makes sense. If you skip making checkpoints, you’re not storing intermediate states. And it turns out this is the only way to get a Docker image made from a Dockerfile smaller. But this is at the cost of readability, and more importantly, at the cost of reduced redundancy between images.

Hypothesis: `apt-get purge` saves some space

Well this hypothesis seems silly now. But I see it used now and then. Deletions do not save space.


Write your Dockerfiles the same way you run commands. Don’t prematurely optimize by adding extra cruft you saw someone else do. If you’re actually worried about image size, use some sort of automation to rebuild Docker images behind the scenes. Just keep that logic out of the Dockerfile. And always keep on measuring. Know your bottlenecks.

Best Practices Finish Writing Me Plz Nerd

Managing Technical Debt in Django Projects

This is fine

I’ve been thinking about this subject a lot, and I’ve been meaning to write something. Rather than procrastinate until I have a lot of material, I’m going to just continuously edit this post as I discover things. Many of these principles aren’t specific to Django, but most of this experience comes from dealing with Django.

Some of these tips don’t cost any time, but some involve investing extra time to do things differently. It’s in the name of saving time in the long run. If you’re writing a few lines of JavaScript that’s going to be thrown away in a day, then you shouldn’t waste time building up a castle of tests around it. Managing technical debt is a design tradeoff. You’re sacrificing some agility and features for developer happiness.

Don’t reuse app names and model names

You can have Django apps with the same name. And you can have models with the same name, but your life will be easier if you can reference a model without having to think about which app it came from. This makes it easier to understand the code; makes it easier to to use tools (grep) to analyze and search makes it easier to use shell_plus, which automatically loads every model for you in the shell.

Leave XXX comments

You should leave yourself task comments in your code, and you should have three levels (like critical, error, warning or high, medium, low) for the priority. I commonly use  FIXME for problems,  TODO for todos,  DELETEME for things that really should be deleted,  DEPRECATED for things I really ought to look at later, and  XXX for documenting code smells and anti-patterns. Some examples:

  • FIXME this will break if user is anonymous
  • TODO  make this work in IE9
  • DELETEME This code block is impossible to reach anyways
  • DEPRECATED use new_thing.method instead of this
  • WISHLIST make this work in IE8
  • XXX magic constant
  • FIXME DELETEME this needs csrf

The comment should be on the same line, so when you grep TODO, you’ll be able to quickly scan what kind of todos you have. This is what other tools like Jenkin’s Task Scanner expect too. Many people say you shouldn’t add TODO comments to your code. You should just do them. In practice, that is not practical, and leads to huge diffs that are hard to review.

There is such a thing has too many comments. For example, instead of writing a comment to explain a poorly named variable, you should just rename the variable.

Naming Things

One of Phil Karlton’s famous two hard things, finding the write names can make a huge difference. With Django code, I find that I’m happiest when I’m following the same naming conventions as the Django core. You should be familiar with the concepts of Hungarian notation and coding without comments (remember the previous paragraph?).

Single letter variables are almost always a bad idea, except with simple code. They’re un-greppable and except for the following, have no meaning:

  • i (prefer idx) — a counter
  • x — When iterating through a loop
  • k, v — Key/Value when looping through dict items
  • na count/total/summation (think traditional for-loop)
  • a, b — When looking at two elements, like in a reduce function (very rare)

An exception is math, where x and y, and m and n, etc. are commonplace.

If the name of your variable implies a type, it should be that type. You would not believe how common the name of the variable lies.

When writing code, it should read close to English. Getter functions should begin with “get_”. Booleans and functions that return booleans commonly begin with “is_” (though anything that is readable and obvious will work).

Write tests

My current philosophy is “if you liked it then you should have put a test on it”. The worst part of technical debt is accidentally breaking things. To its credit, Django is the best framework I’ve used for testing. Unit tests are good for TDD, but functional tests are probably better for managing technical debt because they verify the output of your system for various inputs. Doing TDD, getting 100% coverage, and taking into account edge cases… never happens in practice. That does not mean you shouldn’t try. Adding tests to preventing regressions is the second best thing you can do. And the best thing you can do is to write those tests to begin with.

Get coverage

Running coverage is commonly done at the same time as tests. I skip the html reports and use  coverage report from the command line to get faster feedback. When you have good coverage, you can have higher confidence in your tests.

Be cognizant of when you’re creating technical debt

Of course, every line of code is technical debt, but I’ve started adding a “Technical Debt Note” too all pull requests. This is inspired by how legislation will have a fiscal note to assess how much it would cost. Bills can get shot down because they cost to much for what they promise. Features should be the same. Hopefully, you’re already catching these before you even write code, and you’re writing small pull requests. If we find that a pull requests increases technical debt to an unreasonable degree, we revise the requirements and the code.

Clean as you cook

Most people dread cooking because of the mountain of mess that has to get cleaned after the meal. But if you can master cleaning as you cook, there’s a much more reasonable and manageable mess. As you experiment with code, don’t leave behind unused code clean up inconsistencies as you go. Don’t worry about deleting something that might be useful or breaking something obscure. That’s what source control and tests are there for. Plan ahead for the full life cycle. That means if you’re experimenting with a concept, don’t stop when it works: update everything else to be consistent across the project. If it didn’t work, tear it down and kill it. Don’t get into a situation where you have to support two (or three or four or more) different paradigms.

The Boy Scout rule

“Always leave the campground leaner then you found it”. Feel free to break paradigm that a pull request should only do one thing. If you happen to clean something while working on a feature, there’s no shame in saying you took a little diversion.

Make it easy for others to jump in

Projects with a complicated bootstrapping processes are probably also difficult to maintain. Wipe your virtualenv once in a while. Wipe your database. If it’s painful, and you have to do it often enough, you’ll make it better. Code that doesn’t get touched often has the most technical debt.

tetris game over

Educate your organization that they can’t just ask for a parade of features

This problem fixes itself, one way or another. Either you keep building features and technical debt until you’re buried and everything comes to a standstill and you yell at each other, or you find a way to balance adding features.

Prevent Dependency Spaghetti

Just like how you should try to avoid spaghetti code, having a lot of third party apps that try to pull in dependencies will come back to bite you later on.

  1. Specify requirements with ==, not >=. Not every package uses semantic versioning. And using semver does not guarantee that a minor or bugfix release won’t break something.
  2. Don’t specify requirements of requirements. This is to avoid an explosion of requirements to keep track of.
  3. There’s no easy way to know when it’s safe to delete a requirement. Even if you have good test coverage, your test environment is not the same as production. For example, you can safely delete psycopg2 and still run your tests, but have fun trying to connect to your PostgreSQL database in production.

Don’t support multiple paths

If you’re writing a library consumed by many people, supporting get_queryset and get_query_set is a good idea. For yourself, only support one thing. If you have an internal library that’s used by multiple parts of the code base and you want add functionality while preserving backwards compatibility, you can write a compatibility layer, but then you should update it all within the same pull request. Or at least create an issue to clean it up. Supporting multiple code paths is technical debt.

Avoid Customizing the Admin

The moment you start writing customizations for the admin, you’ve now pinned yourself to whatever the admin happened to be doing that version. Unless you’re running automated browser tests to verify their functionality, you’re setting yourself up for things to break in the future. The Django admin always changes in major ways every version, and admin customizations always have weak test coverage.

Do Code Review and Peer Programming

Code review makes sure that more than one person’s input goes into a feature, and peer programming takes that even further. It helps make sure that crazy functionality and hard to read code doesn’t get into the main code base that others will then have to maintain. If you’re a team of one, do pull requests anyways. You’ll be amazed at all the mistakes and inconsistencies you’ll find when you view your feature all at once. Even better: sleep on your own pull requests so you can see them in a new light.

Don’t Write Unreadable Code

Code review and peer programming are supposed to keep you from writing unmaintainable code. We’ve embraced linters so that we can write code in the same style, coverage so we know when we need to write tests, but what if we could automatically know when we were writing complicated code? We can, using radon or PyMetric‘s McCabe’s Cyclomatic Complexity metric.

Additional Resources

  • Docker and DevOps by Gene Kim, Dockercon ’14
  • Inheriting a Sloppy Codebase by Casey Kinsey, DjangoCon US ’14

Special thanks to Noah S.

Case Study Nerd

How to Ignore the Needle Docs

At PyCon 2014, I learned about a package called “needle” from Julien Phalip’s talk, Advanced techniques for Web functional testing. When I tried using it with a Django project, I immediately ran into problems:

  1. The needle docs aren’t written for Django, so they don’t explain how to use NeedleTestCase with LiveServerTestCase.
  2. I wasn’t using nose as my test runner, and didn’t want to start using it just to run Needle.

The first problem turned out to be easy; use both:

class SiteTest(NeedleTestCase, LiveServerTestCase):

The second problem wasn’t that bad either. If you examine the Nose plugin Needle adds, it just adds a  save_baseline attribute to the test case.

There were a lot of random hacks and tweaks I threw together. I think the best way to show them all is with an annotated example:

import os
import unittest

from django.test import LiveServerTestCase
from needle.cases import NeedleTestCase

# This is a configuration variable for whether to save the baseline screenshot
# or not. You can flip it by manually changing it, with an environment variable
# check, or monkey patching.

# You should be taking screenshots at multiple widths to capture all your
# responsive breakpoints. Only the width really matter,s but I include the
# height for completeness.
    (1024, 800),  # desktop
    (800, 600),  # tablet
    (320, 540),  # mobile

# To keep the test runner from running slow needle tests every time, decorate
# it. In this example, 'RUN_NEEDLE_TESTS' has to exist in your environment for
# these tests to run. So you would run needle tests like:
#     RUN_NEEDLE_TESTS=1 python test
@unittest.skipUnless('RUN_NEEDLE_TESTS' in os.environ, 'expensive tests')
class ScreenshotTest(NeedleTestCase, LiveServerTestCase):
    # You're going to want to make sure your pages look consistent every time.
    fixtures = ['needle.json']

    def setUpClass(cls):
        Sets `save_baseline`.

        I don't remember why I did it here. Maybe the timing didn't work when
        I put it as an attribute on the test class.
        cls.save_baseline = SAVE_BASELINE
        super(ScreenshotTest, cls).setUpClass()

    def assertResponsive(self, scope, name):
        """Takes a screenshot for every responsive size you set."""
        for width, height in SIZES:
            self.set_viewport_size(width=width, height=height)
                    # include the name and browser in the filename
                    '{}_{}_firefox'.format(name, width)
            except AssertionError as e:
                # suppress the error so needle keeps making screenshots. Needle
                # is very fickle and we'll have to judge the screenshots by eye
                # anyways instead of relying on needle's pixel perfect
                # judgements.

    def test_homepage(self):
        urls_to_test = (
            ('/', 'homepage'),
            ('/login/', 'login'),
            ('/hamburger/', 'meat'),
            ('/fries/', 'potatoes'),
            ('/admin/', 'admin'),
        for url, name in urls_to_test:
            self.driver.get(self.live_server_url + url)
                # for now, I always want the full page, so I use 'html' as the
                # scope for my screenshots. But as I document more things,
                # that's likely to change.
                # passing in a human readable name helps it generate
                # screenshots file names that make more sense to me.

Well I hope that that made sense.

When you run the tests, they’re saved to the ./screenshots/ directory, which I keep out of source control because storing so many binary files is a heavy burden on git. We experimented with  git-annex but it turned out to be more trouble than it was worth.

My typical workflow goes like this:

  1. Make sure my reference baseline screenshots are up to date: git checkout master && grunt && invoke needle --make
  2. Generate screenshots for my feature branch: git checkout fat-buttons && grunt && invoke needle
  3. Open the screenshots directory and compare screenshots.

In that workflow,  grunt is used to generate css,  invoke is used as my test runner, and  --make is a flag I built into the needle invoke task to make baseline screenshots.

Now I can quickly see if a change has the desired effect for multiple browser widths faster than it takes to actually resize a browser window. Bonus: I can see if a change has undesired effects on pages that I would have been too lazy to test manually.

Next steps: I still haven’t figured out how to run the same test in multiple browsers.