About a year ago, I wrote a little about why I’m not using Piston. Piston appears to be dead! There hasn’t been a commit since September, which is almost a full year ago. This project was touted as “the way” to do REST APIs in Django and I’m sad that it doesn’t seem to be maintained. I saw some forks of the project on GitHub, but there still doesn’t seem to be much work on it lately. Does anybody know what happened?
Piston Looks Good, But I’m Not Using It
Edit (November 2, 2012): This is horribly outdated. Use class-based views or tastypie.
Firstly, I’ve been missing in action for a few months and I apologize to you, my loyal reader, for that. Without making excuses (here come the excuses), work has been picking up, my girlfriend moved from about 15 miles away to only about 8 blocks away and Starcraft II is in beta. Regardless, I’m back in the Python action. WoooHooo!
REST interfaces & Django
This post is somewhat of a follow-up to my post on RESTful Django web services, because I didn’t really talk about Piston there. Piston (sometimes called django-piston) is a library for creating RESTful services in Django. It supports some of the features that I spoke about in my previous post, such as good caching support with Django’s cache framework, different output formats (e.g. XML & JSON) via what Piston calls emitters, and the ability, but not the requirement, to use Django models as REST resources. I don’t know how I missed Piston before, but people blog (*) about it and it has made the rounds on the Django User’s list. However, even after looking closely at it, I decided not to go with it. In this post I’m going to talk about what I did and did not like and why I rolled my own REST micro-framework. That almost sounds like I’m giving myself too much credit given that my micro-framework is only ~30 lines.
(*) BTW, despite the fact that Eric updates his blog somewhat infrequently (sounds familiar), it is well worth a read.
Piston: the good
Piston ships with quite a bit of good documentation and is allegedly used to power some of BitBucket’s services, which lends it credibility. Specifically, I liked the fact that it plugged directly into Django models. You simply write a short Handler for your model explaining what fields to expose and you’re mostly done.
    import re

    from piston.handler import BaseHandler
    from myapp.models import Blogpost

    class BlogPostHandler(BaseHandler):
        # the trailing comma makes this a one-element tuple, not a string
        allowed_methods = ('GET',)
        fields = ('title', 'content', ('author', ('username', 'first_name')))
        exclude = ('id', re.compile(r'^private_'))
        model = Blogpost

        def read(self, request, post_slug):
            post = Blogpost.objects.get(slug=post_slug)
            return post
It effectively wraps up your handler and does all the JSON/XML/YAML serialization for you while still giving you the ability to customize it. On top of this, it plugs in nicely with Django’s form validation and offers some other nice features, like throttling requests based on which user makes them.
Piston: the bad & the ugly
I started to look at Piston, but because I wasn’t using throttling or OAuth, wasn’t outputting anything other than JSON and wasn’t tying resources to models, I didn’t think that Piston bought me anything. In reality, it wasn’t doing anything for me other than properly returning HttpResponseNotAllowed. My other issue is that this project involved different outputs based on HTTP headers. For example, a GET on a certain URL would return JSON formatted data (a read in the CRUD world) if an HTTP header was present and an HTML page presenting that data if it wasn’t. Piston picks its emitter based on a request parameter named format (e.g. /path/resource/?format=JSON). Piston gets you up and running quickly, but it didn’t fit my use case.
Also, this is a little nitpicky, but when I see something like:
    return rc.FORBIDDEN  # returns HTTP 401
I cringe a little bit considering that 403 is the correct status code for Forbidden (401 is Unauthorized). There’s a ticket for this already. Why did Piston define constants for returning various status codes anyway, when that functionality is already built into Django? Is rc.DELETED so much easier than HttpResponse(status=204)? Perhaps it’s a little clearer and Django really should have HttpResponse subclasses for even the less common responses, but I think this definitely involves repeating yourself (and Django’s mantra is don’t repeat yourself).
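As a quick sanity check on the numbers, Python’s standard library already names these status codes. The snippet below uses http.HTTPStatus (Python 3 standard library, not part of Django or Piston) purely to show which number belongs to which name:

```python
from http import HTTPStatus

# Each member carries the numeric code and the standard reason phrase
for status in (HTTPStatus.UNAUTHORIZED, HTTPStatus.FORBIDDEN, HTTPStatus.NO_CONTENT):
    print(status.value, status.phrase)
```

This prints 401 Unauthorized, 403 Forbidden and 204 No Content, one per line.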
The solution
I always wondered why Django didn’t allow for routing URLs based on the HTTP method: It seems like such a common use case. The developers discussed it back in 2006, but in the end it was decided that building only the simple case was best as it yielded a relatively clean urls.py. Building off of that thread, the example in the Django book (search for “method_splitter”) and another blog post, I rolled a little framework to meet my needs instead of using something like Piston.
    ## utils/dispatcher.py
    from django.http import HttpResponseNotAllowed

    # see rfc 2616 - http://www.ietf.org/rfc/rfc2616.txt s9.2 - s9.9
    HTTP_METHODS = ('GET', 'POST', 'PUT', 'HEAD', 'TRACE', 'DELETE', 'OPTIONS', 'CONNECT')

    def service_dispatcher(request, *args, **kwargs):
        """
        Routes requests to the correct view method based on the HTTP method
        """
        # loop over all possible HTTP methods and find the appropriate service
        allowed_methods = []
        appropriate_service = None
        for method in HTTP_METHODS:
            service_view = kwargs.pop(method, None)
            if service_view is not None:
                # store legal HTTP methods in case we need to return a 405
                allowed_methods.append(method)
                # found the correct service method
                if request.method == method:
                    appropriate_service = service_view

        # if the correct service was found, call it
        # otherwise return a 405 - method not allowed - error
        if appropriate_service is not None:
            return appropriate_service(request, *args, **kwargs)
        else:
            return HttpResponseNotAllowed(allowed_methods)

    ## urls.py
    from django.conf.urls.defaults import *
    from myapp.utils.dispatcher import service_dispatcher
    from myapp.blog import services

    urlpatterns = patterns('',
        # no leading slash: Django matches patterns against the path
        # with its initial "/" already removed
        url(r'^myapp/blog/$', service_dispatcher,
            {'GET': services.blog_get, 'POST': services.blog_post}),
    )
I found this to be much simpler and more easily extensible. The argument against this is that urls.py becomes bigger, but in a lot of ways I found this to be clearer. From reading the urlpatterns, I can quickly tell exactly what gets called in each case. In addition, routing differently based on HTTP headers, cookies, the source or anything else becomes as simple as adding a parameter and a little code to service_dispatcher.
In the end, it wasn’t that I didn’t like Piston, it’s just that I didn’t need it.
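To make the header-based routing idea concrete, here is a minimal, framework-free sketch (Python 3). It is not the dispatcher above: the function name header_dispatcher, the (with_header, without_header) tuple convention and the HTTP_X_WANTS_JSON key are all made up for illustration, and a SimpleNamespace stands in for Django’s request object. The only Django behaviour assumed is that request headers show up in request.META under HTTP_-prefixed keys.

```python
from types import SimpleNamespace

HTTP_METHODS = ('GET', 'POST', 'PUT', 'HEAD', 'TRACE', 'DELETE', 'OPTIONS', 'CONNECT')


def header_dispatcher(request, header=None, **views):
    """Route on the HTTP method first, then on whether `header` is present.

    Each keyword in `views` is an HTTP method name mapped either to a
    callable or to a (view_with_header, view_without_header) tuple.
    Returns a (status, payload) tuple instead of a Django response.
    """
    allowed_methods = [m for m in HTTP_METHODS if m in views]
    view = views.get(request.method)
    if view is None:
        # same idea as HttpResponseNotAllowed: report the legal methods
        return 405, allowed_methods
    if isinstance(view, tuple):
        with_header, without_header = view
        view = with_header if header in request.META else without_header
    return 200, view(request)


# Stand-in request: only .method and .META are used by the dispatcher
request = SimpleNamespace(method='GET', META={'HTTP_X_WANTS_JSON': '1'})
views = {'GET': (lambda r: '{"title": "hi"}', lambda r: '<h1>hi</h1>')}

print(header_dispatcher(request, header='HTTP_X_WANTS_JSON', **views))
```

With the header present the JSON view runs; without it the same URL and method fall through to the HTML view, and an unlisted method gets a 405 along with the list of allowed methods.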
RPC and Authentication
I’m working on adding support for authenticated service calls to RPC4Django, built on top of Django’s user authentication. While doing this, I took a brief look around at how other projects implemented authentication for XMLRPC or JSONRPC. Without exception, they all implemented it such that the username and password were part of the RPC call like so:
    from django.contrib.auth import authenticate

    def myAuthenticatedMethod(user, password, otherparams):
        # authenticate user
        user = authenticate(username=user, password=password)
        # verify the user is valid and has the appropriate permissions
        # perform method actions
Some of them abstracted the actual username and password checking into a decorator, but in the end, the RPC call had the username and password in the parameters. It seemed bulky and out of place. This led to an analysis about authentication and authorization and what should be handled where. As a little spoiler, I don’t like the idea of sending the username and password in the RPC parameters one bit.
Authentication & Authorization
In applications, authentication is the process that confirms the identity of the user. Usually this takes the form of a login form, HTTP basic authentication, or something similar. Authorization is the process to determine whether the user has sufficient privileges to perform the specified action. This takes the form of permission checks based on the authenticated user. Therefore, authentication must come before authorization.
Fortunately, Django’s user authentication helps with both authentication and authorization. The authenticate method checks a username and password against the set of Django users and returns the user object if everything goes well. Once this user object is retrieved, permissions can be checked using the has_perm method. Django has a pretty easy way to create new permissions based on your application’s logic. Permissions have to be checked at the specific method level since permissions are closely tied to the application logic. I like the idea of abstracting much of it into a decorator though. The only remaining question is: where do the username and password come from?
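A minimal sketch of what such a decorator could look like (Python 3). It assumes the user has already been authenticated elsewhere and is passed in as the first argument. The names requires_perm, PermissionDenied, FakeUser and the permission string 'blog.add_blogpost' are hypothetical; the only thing borrowed from Django is the has_perm(perm) method on the user object, which FakeUser imitates.

```python
from functools import wraps


class PermissionDenied(Exception):
    """Raised when the caller lacks the required permission."""


def requires_perm(perm):
    """Wrap a function so it only runs if user.has_perm(perm) is true."""
    def decorator(func):
        @wraps(func)
        def wrapper(user, *args, **kwargs):
            if user is None or not user.has_perm(perm):
                raise PermissionDenied(perm)
            return func(user, *args, **kwargs)
        return wrapper
    return decorator


class FakeUser:
    """Stand-in for a Django user; only has_perm() is modelled."""
    def __init__(self, perms):
        self.perms = set(perms)

    def has_perm(self, perm):
        return perm in self.perms


@requires_perm('blog.add_blogpost')
def add_post(user, title):
    # the method body contains only application logic, no credential handling
    return 'created ' + title


print(add_post(FakeUser(['blog.add_blogpost']), 'hello'))
```

The decorated function never sees a password, so it stays usable from the rest of the project as well as from an RPC call.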
An Example from the Real World
Why should every RPC method need to be specially written to accept the login credentials and authenticate the user? This makes the method usable only as an RPC method and useless to the rest of the project, which is bad for code reuse. Amazon S3, a commercial web service for storing files, is a perfect example of the proper way to authenticate and authorize users. With S3, the login information is contained in the HTTP header in a manner similar to HTTP basic authentication. This way, the request can be rejected based on login credentials before it is even routed to the requested method. Permission checking (seeing whether the user is allowed to store new files, for example) still needs to be done at the method level, but at least the identity of the user is known.
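For the plain HTTP basic authentication case, pulling the credentials out of the header before any method runs might look roughly like this (Python 3, standard library only). The function name parse_basic_auth is made up for illustration, and this is not S3’s actual scheme; it only decodes the standard "Basic base64(username:password)" form of the Authorization header.

```python
import base64


def parse_basic_auth(header_value):
    """Return (username, password) from a Basic Authorization header, or None."""
    if not header_value:
        return None
    scheme, _, encoded = header_value.partition(' ')
    if scheme.lower() != 'basic':
        return None
    try:
        decoded = base64.b64decode(encoded.strip(), validate=True).decode('utf-8')
    except ValueError:
        # covers malformed base64 and undecodable bytes
        return None
    username, sep, password = decoded.partition(':')
    if not sep:
        return None
    return username, password


# Build a header the way an HTTP client would, then parse it back
header = 'Basic ' + base64.b64encode(b'rpc4django:rpc4django').decode('ascii')
print(parse_basic_auth(header))
```

A request with a missing or malformed header can then be rejected with a 401 before it is ever routed; the decoded pair would be handed to something like Django’s authenticate.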
Implementation and Demo
For RPC4Django, I’m proposing that authentication be handled at a higher level, with basic HTTP authentication for example. To illustrate this, I set up an HTTPS RPC4Django demo site that requires a username and password (rpc4django/rpc4django). The demo site requires that you accept a self-signed certificate. Using Python, it is possible to send authenticated requests like so:
    from xmlrpclib import ServerProxy

    s = ServerProxy('https://rpc4django:rpc4django@rpcauth.davidfischer.name/')
    s.system.listMethods()
The next step is to modify RPC4Django to actually be able to specify permissions for specific methods and to actually log in the users. Expect a release this week.
RESTful Django Powered Web Services
What is REST
REST is an alternative to RPC based web services such as JSONRPC, XMLRPC and SOAP. Instead of simply using HTTP POST for all of its requests (with JSONRPC’s proposed GET implementation excepted) like RPC services do, it uses all the HTTP methods. It usually includes GET, POST, DELETE, PUT and other methods to achieve different results and thereby uses relatively few URIs.
Some people think any web service that makes various services available at URIs is REST. It isn’t. Some people make a service at one URI for getting an object, another for saving the object, another for getting a list of objects, another for a list of objects matching certain criteria. This is just RPC outside of the realm of a specific protocol (like XMLRPC). If people are going to use simple HTTP RPC requests to get all their data but not follow any specific pattern, they’d be better off with a real RPC implementation.
How is it Better (or Worse)
REST has a lot going for it. Because it is a little more “native” to the HTTP protocol, caching can work very efficiently. Depending on language support, it may be easier to work with a REST interface than with a more complex RPC-specific protocol. Its simplicity can be very beautiful. The RESTful idea of making your data available as a “resource” that links via hypertext to more resources can make REST very powerful.
Instead of
GET http://example.com/testcase/56
<testcase>
    <results>
        <result>1</result>
        <result>2</result>
    </results>
</testcase>
You have
GET http://example.com/testcase/56
<testcase>
    <results>
        <result>http://example.com/result/1</result>
        <result>http://example.com/result/2</result>
    </results>
</testcase>
In the 2nd method, a full test case object can be generated from a request to the testcase object followed by later requests for the result objects. The client would not need to know anything special about testcases or the specific domain as it would in the first example.
RPC also has a lot going for it and there are some cases where I would pick it over REST. Caching is not always very important and when it isn’t, the benefits of REST are not as apparent. Most RPC protocols already have the capability out of the box to construct objects (for SOAP, very complex objects) from web service calls. They also usually have introspection methods or WSDL to figure out what services are available. These would need to be built by a crafty REST service developer. RPC, however, doesn’t take much advantage of the HTTP protocol in that most requests are just POST requests with an RPC payload. At the same time, every HTTP implementation supports POST and not all of them support PUT or DELETE.
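A small sketch of a client following those links (Python 3, standard library only). To keep it runnable without a network, a dict of canned responses stands in for real HTTP GETs; the fetch and load_testcase names and the shape of the result documents (a status element inside result) are invented for the example.

```python
import xml.etree.ElementTree as ET

# Canned responses standing in for real HTTP GETs (hypothetical payloads)
FAKE_WEB = {
    'http://example.com/testcase/56': (
        '<testcase><results>'
        '<result>http://example.com/result/1</result>'
        '<result>http://example.com/result/2</result>'
        '</results></testcase>'
    ),
    'http://example.com/result/1': '<result><status>pass</status></result>',
    'http://example.com/result/2': '<result><status>fail</status></result>',
}


def fetch(uri):
    """Stand-in for an HTTP GET; a real client would make a request here."""
    return FAKE_WEB[uri]


def load_testcase(uri):
    """Fetch a testcase, then follow each linked result URI."""
    root = ET.fromstring(fetch(uri))
    statuses = []
    for link in root.findall('./results/result'):
        # the element text is itself a URI, so the client just follows it
        result = ET.fromstring(fetch(link.text))
        statuses.append(result.findtext('status'))
    return statuses


print(load_testcase('http://example.com/testcase/56'))
```

The client never builds a result URI itself; it only follows whatever links the testcase document hands it.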
Next Steps
Django has a few libraries to help with REST interfaces, but nothing I’ve seen is that great. I am going to look into creating one or contributing to an existing project. Here are some things I’d like to see in a REST API:
- In the Django 1.1 development version, PUT, DELETE, OPTIONS and HEAD are available in django.test.client.Client. A REST interface should use them by default and have another mechanism for clients that do not support these lesser used HTTP methods.
- caching and ETags
- different output formats (eg. XML and JSON)
- service/resource discovery or introspection (similar to WSDL or system.listMethods)
- a client library that can generate complex native Python objects given a URI
- models and other sources of data as REST resources
- integration at some point with the Django trunk!
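To give a feel for the caching and ETags item, here is a minimal sketch of a conditional GET (Python 3, standard library only). Hashing the response body is just one common way to derive an ETag; the make_etag and conditional_get names are hypothetical, and the wildcard form of If-None-Match ("*") is deliberately not handled.

```python
import hashlib


def make_etag(body):
    """Derive a quoted ETag from the response body bytes."""
    return '"' + hashlib.sha256(body).hexdigest() + '"'


def conditional_get(body, if_none_match=None):
    """Return (status, etag, payload), honouring an If-None-Match header."""
    etag = make_etag(body)
    if if_none_match is not None:
        candidates = [tag.strip() for tag in if_none_match.split(',')]
        if etag in candidates:
            # the client already holds this representation: send no body
            return 304, etag, b''
    return 200, etag, body


status, etag, payload = conditional_get(b'<testcase/>')
print(status)
print(conditional_get(b'<testcase/>', if_none_match=etag)[0])
```

The first request gets a 200 with the full body; repeating it with the returned ETag gets a 304 and an empty body, which is where the bandwidth saving comes from.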
What’s Already Out There
There are a few Django projects for making data available via REST. These efforts seem to have stalled or be in infant stages.
- Django model views — A GSoC project to make Django model data available via REST. This project never seemed to get far off the ground. I don’t think it has been updated much since 2007.
- Django REST interface — Another GSoC project to create RESTful interfaces. There seem to be some active users of this one and it seems to be more fully featured than the above model views project. However, it has stalled and there hasn’t been much work on it in the past few years.
- Django RESTAPI — Another project to make models available via REST. This project seems to have been more recently updated and it seems ok, but it still isn’t ready for prime time or in PyPI.
- RESTinPy — A SourceForge project that makes data available in REST. This project seems somewhat advanced but it hasn’t been updated since the first cut was put onto SourceForge and PyPI.
- DAPI — Another model-to-REST mapping module.
Reading
- HTTP protocol
- Fielding’s dissertation on REST — the inspiration of REST
- REST Worst Practices — by Jacob Kaplan-Moss of Django fame
- Creating a REST protocol
- Common REST mistakes
- A counterpoint on why Django may not need a REST API
Edit (July 15, 2010): I wrote an update involving Piston, a popular REST framework for Django.
Update (September 7, 2011): There were some updates on what folks in the community were using at Djangocon.