Category Archives: Neato!

Neato! Nerd

Check yo queries before you wreck yo site

When building high performance Django sites, keeping the number of queries down is essential. And just like controlling technical debt and maintaining test coverage, being successful means making monitoring queries a natural part of your workflow. If your momentum has to be stopped to examine database queries, you’re not going to do it. The solution for most developers is Django-Debug Toolbar, which sits off to the side in your web browser, but I’m going to share the way I do it.

I do like Django-Debug Toolbar, but most of the time, it just gets in my way: it only works on pages that return HTML, the overhead adds to the response time, and it modifies the response (adding to the response size and render time). What I prefer is displaying SQL queries along HTTP requests in runserver’s output:

Terminal Output with Colorizing Output and SQL

There’s three parts to getting this to work:

  1. ColorizingStreamHandler — this lets me distinguish http requests (gray/info) from SQL queries (blue/debug)
  2. ReadableSqlFilter — this reformats output by stripping the SELECT arguments so you can focus on the WHERE clauses
  3. Opt-in — having SQL spat out everywhere can be distracting, so it’s opt-in with an environment variable

Getting started

ColorizingStreamHandler and ReadbleSqlFilter are a logging handler and a logging filter packaged in project_runpy. Add it to your Django project with pip install project_runpy (no installed apps changes needed). They get thrown into your Django logging configuration like:

[python]LOGGING = {
    'version': 1,
    'disable_existing_loggers': False,
    'root': {
        'level': os.environ.get('LOGGING_LEVEL', 'WARNING'),
        'handlers': ['console'],
    },
    'filters': {
        'require_debug_false': {
            '()': 'django.utils.log.RequireDebugFalse',
        },
        'require_debug_true': {
            '()': 'django.utils.log.RequireDebugTrue',
        },
        'readable_sql': {
            '()': 'project_runpy.ReadableSqlFilter',
        },
    },
    'handlers': {
        'console': {
            'level': 'DEBUG',
            'class': 'project_runpy.ColorizingStreamHandler',
        },
    },
    'loggers': {
        'django.db.backends': {
            'level': 'DEBUG' if env.get('SQL', False) else 'INFO',
            'handlers': ['console'],
            'filters': ['require_debug_true', 'readable_sql'],
            'propagate': False,
        },
        'factory': {
            'level': 'ERROR',
            'propagate': False,
        },
    }
}
[/python]

From: https://github.com/crccheck/crccheck-django-boilerplate/blob/master/project/project_name/settings.py#L84

I found that the logging can get obtrusive. It’ll show up in your runserver, your shell, when you run scripts, and even in your iPython notebooks. So using a short environment variable makes it easy to flip on and off. I’m also using project_runpy’s  env environment variable getter, but if you don’t want that, I suggest using  if 'SQL' in os.environ to avoid how  '0' and  'False' evaluate to  True when reading environment variables.

I was afraid that adding ReadableSqlFilter would add a performance penalty, but I don’t notice any. The extra verbosity is also nice when you have a view with several expensive queries because you can see them run before the page is rendered.

How to use this

Just code as usual! If you’re like me, you keep at least part of your runserver terminal visible. If you are me, hello! Use your peripheral vision to look for long flashes of blue text. When you do, you know you’ve hit something that’s making too many queries. The main culprits will be missing select_relateds and prefetch_relateds. Eventually, you’ll develop a sixth sense for when your querysets could be better, and then you’ll look back at your old code like this:

that-queryset-ain't-right

Continuing

A warning: don’t think too much about the query time reported. You can’t compare database queries done locally with what happens in production. One thing we did confirm is 10 one second queries are much slower than one 10 second query, even in the same AWS availability zone. Something that isn’t as obvious when everything is local. Just pay attention to the number of queries.

Finally, if you really want to keep your database hits low as you continue to develop, document them in your tests. Django makes an assertion available called  assertNumQueries that you can throw in your tests just to document how many queries an operation takes. They don’t even have to be good; for example, I wrote these scraper tests to document that my code currently makes too many database queries. It’s similar to making sure your views return 200s. Make sure you know how many queries you’re getting yourself into.

Too Many Queries

Neato!

A new canvas demo

I finally got around to updating my canvas / sql visualization code. Interacting with the map is now much easier.

And I’ve posted a live demo of it here: http://crccheck.com/demo02

It’s not as refined as google maps, but it’s still pretty fancy. Navigation is done in the minimap, and you can draw on the large map with your mouse pointer.

This time, I made it work in FireFox and Chrome/Safari

Life=Boring Mental Note: Add Category Neato! Nerd

Blast from the Past [2000] my !hme script

I completely forgot about this mIRC script I made back in 2000, but apparently I was ahead of my time. I wrote a crowdsourcing script before there was crowdsourcing. The script was very simple, query it with !hme <name of show> and the script replies with how many episodes it thinks are in the show. To contribute, anyone could say !hme.add <name of show> <number of episodes>. I did have a blacklist of people who couldn’t add, but I don’t think I ever had to use it. You didn’t care about the accuracy of the name of the shows, because the script searched the best fit, and if there were duplicate variations (Evangelion vs Neon Genesis Evangelion), it didn’t matter. I just stored extra information. The last time the script was used… 2002… it had 500 entries.

Life=Boring Neato!

Jamba Juice Secret Menu

Tried

  • White Gummy – delicious+
  • Red Gummy – delicious
  • Strawberry Shortcake – pretty good

Want to Try

  • Pink Starburst
  • Orchard Oasis
  • Peanut Butter and Jelly
  • Skittles
  • Raspberry Dreamin’
  • Pineapple Dreamin’
  • Apple Pie

Not sure if my Jamba Juice knows about

  • Fruity Pebbles
  • Push Pop
  • Bluetopia
  • Sourpatch Kid
  • Now and Later
  • chocolate gummy bear
  • Blue Gummy Bear
Neato! Nerd

Using XPath to find email address links

I wanted to write a Greasemonkey script to modify all mailto: links on a page, but to do that, I have to find them first.

Here’s my first attempt:

//a[@href]/text()[contains(.,”@”)]

It’s pretty bad. I wanted to find links starting with “mailto:” but couldn’t figure out how to operate on the href attribute.

second attempt:

//a/@href[contains(.,”mailto:”)]/..

Here you can see that I managed to operate on the href attribute, and then back back up to the a node. I learned how to select the attribute, instead of using the attribute to select the node.

third attempt:

//a[contains(@href,”mailto:”)]

Simplified even further!

fourth attempt:

//a[starts-with(@href,”mailto:”)]

I knew there was a starts with function… I just had to look it up.

Neato!

Chassis blowholes are good for other things

At first, I thought the little shroud on the Antec P182 (also on the P180) was a little annoying because it broke up the horizontal space that I could otherwise use for piling up misc crap. Lately, I’ve been putting my external hard drives on top of the shroud. The airflow helps cool them down. My 7200 rpm 2.5″ drive in the Macally PHR-250CC (an excellent enclosure) gets really hot, and this will definately help.