RPC4Django is Now Hosted in Launchpad

After some discussion in my last post, I decided to host RPC4Django in Launchpad. Every release dating back to 0.1.0 is uploaded and hosted properly there. I also created a 0.1.8 milestone, which I hope to work on in the next couple of weeks. I tried to request a Launchpad import from Subversion, but it didn’t go smoothly. Launchpad isn’t really set up to handle imports from password-protected Subversion repositories where the password doesn’t give full access. Regardless, all future releases will be from the publicly hosted Bazaar repo in Launchpad.

Updates March 2010 Edition

This post is mainly going to be an update on what I am thinking and what I’ve been working on the past few weeks.

Work

At the beginning of this year, I took a new position (same company) in a security group. Our primary focus is to ensure that the company is shipping secure, OSS-compliant, legally compliant code. My specific role is to develop tools (with Django) that help make sure that happens. This is an exceptionally interesting project and involves pulling in vast amounts of data (terabytes) from many sources (multiple VCS, multiple databases) and presenting it in a comprehensive manner. This project and my work have led to some good problems:

Some of our databases are MSSql databases. This is a problem since we’re a Linux shop. pyodbc works great for connecting to MSSql from Linux, but unfortunately, there are some incompatibilities with django-pyodbc. In addition, the project doesn’t seem to be that widely used, so it isn’t supported or documented as well as it could be. We are considering SQLAlchemy/Elixir as well, but I’ve been able to patch up django-pyodbc to get it (mostly) working with the Django trunk. I also have some concerns about the django-pyodbc project as a whole. I’m considering working on this project pretty heavily.

Also, as part of my work, a coworker and I detailed a security flaw we found in urllib2. It resulted in basic authentication credentials being sent to sites that did not request them (and weren’t running SSL).

Future of RPC4Django

I have been considering moving RPC4Django from my personal Subversion repository to Google Code or GitHub. I feel that there are a few advantages to this:

  • It is easier for others to contribute and get involved.
  • A public bug tracker would let other people easily raise issues instead of emailing me directly. This way we have public archives and the information can be found by anyone interested in RPC4Django.
  • If I were hit by a bus, someone could easily take it over.

I might make a mailing list as well. Are there any strong opinions on this?

Django Scripting and the Crontab

Sometimes, you need part of your Django application to run from the command line. These scripts can be caching jobs that run periodically to speed up performance or data collection jobs that pull information from various sources into your application. Although James Bennett has a great article on writing standalone Django scripts, I just wanted to update it with changes that have happened since 2007. I ran into this problem about the same time on both a work project and on a personal project.

The problem with standalone scripts

The basic problem is that you want your script to run in the context of your application. You want to access your databases in the normal way and you want to import modules by the same paths. In general, you can do all of this work yourself by making sure your PYTHONPATH is correct and DJANGO_SETTINGS_MODULE is set properly, but it is so much easier to just create a custom management command. This is especially convenient since it makes your script portable (Windows, Mac and Linux; crontab and scheduled tasks) as well as easily distributable with your application.
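For comparison, the do-it-yourself approach looks roughly like the sketch below. The path /path/to/mysite and the module mysite.settings are placeholders, the helper names are mine, and the django.setup() call assumes Django 1.7 or later (releases from the era of this post did not need it).

```python
import os
import sys


def configure_environment(project_dir, settings_module):
    """Make the project importable and tell Django which settings to load."""
    if project_dir not in sys.path:
        sys.path.insert(0, project_dir)
    # setdefault leaves an already exported DJANGO_SETTINGS_MODULE untouched
    os.environ.setdefault("DJANGO_SETTINGS_MODULE", settings_module)


def main():
    # Placeholder values: substitute your own project path and settings module.
    configure_environment("/path/to/mysite", "mysite.settings")

    # Import Django only after the environment is prepared.
    import django
    django.setup()  # assumption: Django 1.7+, which requires this call

    # From here on, application imports and ORM queries behave as usual.
```

Every standalone script has to repeat this bootstrapping, which is the bookkeeping a management command takes off your hands.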

Custom management commands

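A custom management command is a command that can be run from manage.py. Essentially, Django requires that you create a management/commands package under your application: a management directory containing an __init__.py and a commands subdirectory, which in turn contains its own __init__.py and one module per command (mycommand.py in this example).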

Then, in mycommand.py, you must define a class named Command that subclasses django.core.management.base.BaseCommand and implements handle().

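This will allow you to run python manage.py mycommand (the command is named for mycommand.py), or to create a crontab entry that changes into the directory containing manage.py and runs the same command, for example: 0 * * * * cd /path/to/project && python manage.py mycommand (the schedule and the path are placeholders).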

Conclusion

This provides the cleanest, most flexible way to build and support standalone scripts that can be used outside of the web server. It will run in exactly the same environment as your application and the same modules used in your application can be used here. The only problem I ran into was that custom management commands must be run from the same directory as manage.py.

Edit: I added thorough documentation on management commands to ticket #9170

Update: The 1.2 documentation contains the changes from my patch.

Extending Distutils for Repeatable Builds

Distutils is Python’s built-in mechanism for packaging and installing Python modules. It is very convenient for packaging up your source code, scripts and other files and creating a distribution to be uploaded to PyPI, as I’ve mentioned before. Distutils was discussed (pdf) at PyCon last year, and it looks like there are efforts afoot to add some much-needed features like unit testing and metadata. Add-on packages like pip add features like uninstallation and dependency management, but nothing guarantees that your users have them installed. Although Python’s packaging and distribution model beats PHP’s hands down, there is still a lot of room for improvement to make it seamless.

Release management

In essence, these issues and enhancements boil down to making release management easier. When releasing your package, you want to make sure that it contains all the appropriate files, is tested and can be installed easily. Distutils helps with the installation, pip with the dependencies, and virtualenv (a topic for a later post) helps a lot with testing package interactions. But what about unit tests? What about cleaning up after setup.py? What about generating documentation or other files?

Extending distutils

Until all these features get put into distutils, you have to extend it yourself in setup.py. Fortunately, this is not very complicated and can buy you some reliability in your build process. Adding a command like python setup.py test is pretty trivial:
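As a rough sketch of what such a command can look like: the class name TestCommand, the test-dir option and the default tests directory are my own choices for illustration, and unittest discovery needs Python 2.7 or later. Because distutils was removed from the standard library in Python 3.12, the import falls back to the equivalent class in setuptools.

```python
import unittest

try:
    from distutils.core import Command
except ImportError:  # distutils is gone from the stdlib in Python 3.12+
    from setuptools import Command


class TestCommand(Command):
    """Run the project's unit tests via "python setup.py test"."""

    description = "discover and run the unit tests"
    # (long option, short option, help text); "=" means it takes a value
    user_options = [("test-dir=", "d", "directory to search for tests")]

    def initialize_options(self):
        # Every declared option needs a default here
        self.test_dir = None

    def finalize_options(self):
        # Fall back to a conventional directory name (an assumption)
        if self.test_dir is None:
            self.test_dir = "tests"

    def run(self):
        suite = unittest.defaultTestLoader.discover(self.test_dir)
        result = unittest.TextTestRunner(verbosity=2).run(suite)
        if not result.wasSuccessful():
            # A non-zero exit status lets scripts notice the failure
            raise SystemExit(1)
```

To register it, pass cmdclass={"test": TestCommand} to setup() in setup.py; after that, python setup.py test runs the suite.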

The same sort of functionality could be used to verify any prerequisites not already checked by distutils or pip, to generate documentation without external dependencies like Make (although Django still supports Python 2.3, which predates this functionality), or to create a uniform way to take source control diffs and submit patches. Executing these commands from one place makes the whole process more consistent and easy to understand. Hopefully the new enhancements to distutils will make the process even better.