Getting RESTful with web.py

Django may be the Python web framework getting all the press recently, but web.py is a nice, simple framework in its own right. One of its nicest aspects is that a handler class defines one method per HTTP method (GET, POST, PUT, DELETE, etc.), and web.py calls the matching method to process each request from the client. This approach makes it remarkably easy to write a RESTful API.

web.py

import web
class Resource(object):
    def GET(self, name):
        # return the resource
        pass
    def POST(self, name):
        # update/create the resource
        pass

This approach is very similar to what Google App Engine does with its webapp framework.

Google App Engine

From the App Engine docs:

from google.appengine.ext import webapp
class Resource(webapp.RequestHandler):
    def get(self):
        # return the resource
        pass
    def post(self):
        # update/create the resource
        pass

You can still make a nice REST app with Django, but in each view you have to check the request type in the HttpRequest object. (If you are interested in creating a Django REST API, check out django-rest-interface. UPDATE: Simon Willison also has a nice way to get the web.py style dispatching in Django with RestView.)

Django

def resource(request):
    if request.method == 'GET':
        # return the resource
        pass
    elif request.method == 'POST':
        # update/create the resource
        pass
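
The difference between these styles comes down to who does the dispatching: the framework or your view. As a rough sketch of the idea behind method-named handlers (this is not web.py's actual implementation, and `dispatch` and `MethodNotAllowed` are hypothetical names made up for illustration), the lookup can be as small as a `getattr` call:

```python
class MethodNotAllowed(Exception):
    """Raised when a handler has no method for the HTTP verb."""


class Resource(object):
    # Handler methods are named after the HTTP verbs.
    def GET(self, name):
        return 'contents of %s' % name

    def POST(self, name):
        return 'stored %s' % name


def dispatch(handler, http_method, *args):
    """Call the handler method named after the HTTP verb (hypothetical helper)."""
    fn = getattr(handler, http_method.upper(), None)
    if fn is None:
        raise MethodNotAllowed(http_method)
    return fn(*args)


get_result = dispatch(Resource(), 'get', 'readme')
post_result = dispatch(Resource(), 'POST', 'readme')
```

A verb the handler does not define, such as DELETE here, raises `MethodNotAllowed` instead of silently falling through an if/elif chain.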

RESTify web.py

So web.py has a nice way of dealing with the HTTP methods; let's take a look at an example. I created a simple REST-based key-value pair database, called docstore.py. Docstore will store whatever you send to it under the key you specify. It has a few implementations: one using an in-memory dictionary, one using a file approach, and another using Python's shelve module. For this REST example, let's just use the dictionary storage engine (a warning: if you use the dictionary approach in a CGI environment, you will lose state after each request).

REST is a good fit for what we want to do with the docstore.py application. When we want to obtain a copy of a resource from the server, the client (like a browser) issues an HTTP GET request with the key as the resource name. To publish a document, we can either use the HTTP PUT method and get a UUID back as its key, or use the HTTP POST method to store the document under a predefined key. In both cases the response contains the key that the value is stored at. If we no longer want the document on the server, we use the HTTP DELETE method.

First, we will set up the basics: the imports, URL mappings, and run statement:

import web
import re
import uuid

urls = ('/memory/(.*)', 'MemoryDB')
if __name__ == "__main__":
    web.run(urls, globals())

Additionally, we will specify what a valid key is. We will use a regular expression and a decorator, which help protect against directory traversal attacks when using the filesystem implementation.

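# \Z anchors the pattern at the end of the string; match() alone only
# anchors at the start, so 'abc/../x' would otherwise be accepted.
VALID_KEY = re.compile(r'[a-zA-Z0-9_-]{1,255}\Z')
def is_valid_key(key):
    """Checks to see if the parameter follows the allowed pattern of
    keys.
    """
    return VALID_KEY.match(key) is not None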

def validate_key(fn):
    """Decorator for HTTP methods that validates if resource
    name is a valid database key. Used to protect against
    directory traversal.
    """
    def new(*args):
        if not is_valid_key(args[1]):
            # Stop here; without the return, fn would still run.
            return web.badrequest()
        return fn(*args)
    return new
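
One detail worth checking in a validator like this is anchoring: `re.match` only anchors at the start of the string, so a key with a valid prefix can slip through unless the pattern is also anchored at the end. The sketch below compares the two behaviours using only the standard library; `ANCHORED_KEY`, `UNANCHORED_KEY`, and `accepts` are illustrative names, not part of docstore.py.

```python
import re

# Key pattern with an explicit end-of-string anchor (\Z).
ANCHORED_KEY = re.compile(r'[a-zA-Z0-9_-]{1,255}\Z')
# The same character class without an end anchor, for comparison.
UNANCHORED_KEY = re.compile(r'[a-zA-Z0-9_-]{1,255}')


def accepts(pattern, key):
    """Return True if pattern.match() accepts the key."""
    return pattern.match(key) is not None


candidates = ['12345', 'johnny_tables', '../etc/passwd',
              'notes/../../etc/passwd', '']
# Each entry is (key, accepted without anchor, accepted with anchor).
results = [(key, accepts(UNANCHORED_KEY, key), accepts(ANCHORED_KEY, key))
           for key in candidates]
```

Only the anchored pattern rejects `notes/../../etc/passwd`; the unanchored one matches the leading `notes` and stops there.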

Now we define an abstract class for the database, creating a common interface for the three implementations of the data store. This abstract class is where the REST goodness is.

We use four of the HTTP methods: GET, POST, PUT, and DELETE. The GET method, when no key is specified, prints a list of all the keys in the database. We decorate the methods that use the key to ensure that the key is safe. The PUT method generates a UUID and delegates to the POST method, using that UUID as the key. In the POST method, we obtain the contents of the HTTP request using "web.data()".

class AbstractDB(object):
    """Abstract database that handles the high-level HTTP primitives.
    """
    def GET(self, name):
        if len(name) <= 0:
            print '<html><body><b>Keys:</b><br />'
            for key in self.keys():
                print ''.join(['<a href="',str(key),'">',str(key),'</a><br />'])
            print '</body></html>'
        else:
            self.get_resource(name)

    @validate_key
    def POST(self, name):
        data = web.data()
        self.put_key(str(name), data)
        print str(name)

    @validate_key
    def DELETE(self, name):
        self.delete_key(str(name))

    def PUT(self, name=None):
        """Creates a new document with the request's data and
        generates a unique key for that document.
        """
        key = str(uuid.uuid4())
        self.POST(key)

    @validate_key
    def get_resource(self, name):
        result = self.get_key(str(name))
        if result is not None:
            print result
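
To see the PUT-delegates-to-POST idea in isolation, here is a minimal sketch with the HTTP layer stripped away. `KeyedStore` and its lowercase `post`/`put` methods are hypothetical stand-ins, not part of docstore.py; they only illustrate generating a UUID key and reusing the keyed-store path.

```python
import uuid


class KeyedStore(object):
    """Hypothetical stand-in for the database class, without the HTTP layer."""

    def __init__(self):
        self.data = {}

    def post(self, key, body):
        # Store the body under a caller-chosen key and echo the key back.
        self.data[key] = body
        return key

    def put(self, body):
        # Generate a random (version 4) UUID and reuse post() for the storage.
        return self.post(str(uuid.uuid4()), body)


store = KeyedStore()
new_key = store.put('a new message')
```

A UUID string is 36 characters of hexadecimal digits and hyphens, so a generated key also fits the character set and length limit used for valid keys.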

Finally, we create an implementation of the AbstractDB, MemoryDB, that stores all the key-value pairs in a Python dictionary that is shared among all instances of MemoryDB (but will be lost when run in CGI mode). If a key is requested and that key does not exist in the dictionary, we return a 404 Not Found error, using "web.notfound()". web.py defines a few common HTTP errors in webapi.py, including:

  • web.badrequest() : 400 Bad Request error
  • web.notfound() : 404 Not Found error
  • web.gone() : 410 Gone error
  • web.internalerror() : 500 Internal Server error

class MemoryDB(AbstractDB):
    """In memory storage engine.  Lacks persistence."""
    database = {}
    def get_key(self, key):
        try:
            return self.database[key]
        except KeyError:
            web.notfound()

    def put_key(self, key, data):
        self.database[key] = data

    def delete_key(self, key):
        try:
            del self.database[key]
        except KeyError:
            web.notfound()

    def keys(self):
        return self.database.iterkeys()
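
The reason the dictionary survives between requests is that it is a class attribute rather than an instance attribute. Assuming the framework creates a fresh handler instance for each request, every instance still sees the same dictionary object for as long as the process lives. A small sketch of that behaviour, using a hypothetical `SharedStore` class:

```python
class SharedStore(object):
    # Class attribute: one dictionary object shared by every instance.
    database = {}

    def put_key(self, key, data):
        # Mutates the shared dictionary; it does not rebind the attribute.
        self.database[key] = data

    def get_key(self, key):
        return self.database.get(key)


first = SharedStore()
second = SharedStore()
first.put_key('12345', 'hello')
value_seen_by_second = second.get_key('12345')
```

This is also why the data disappears under CGI: each request runs in a new process, so the class, and its dictionary, are created from scratch.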

Testing it out

In one command window, run the server:

$ python docstore.py
http://0.0.0.0:8080/

And, assuming you have httplib2 installed, open an interactive Python session:

$ python
Python 2.5.2 (r252:60911, Jul 31 2008, 17:28:52)
[GCC 4.2.3 (Ubuntu 4.2.3-2ubuntu7)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import httplib2
>>> h = httplib2.Http()

Let's store a message with a key of '12345' and then get it back:

>>> h.request('http://localhost:8080/memory/12345','POST','hello')
({'transfer-encoding': 'chunked', 'date': 'Sun, 21 Sep 2008 00:17:08 GMT', 'status': '200', 'server': 'CherryPy/3.0.1'}, '12345\n')
>>> h.request('http://localhost:8080/memory/12345','GET')
({'transfer-encoding': 'chunked', 'date': 'Sun, 21 Sep 2008 00:17:38 GMT', 'status': '200', 'content-location': 'http://localhost:8080/memory/12345', 'server': 'CherryPy/3.0.1'}, 'hello\n')
http://cdn.johnpaulett.com/upload/webpy-datastore12345.png

Now let's delete the object at key 12345. We then get a 404 error if we try to retrieve key 12345:

>>> h.request('http://localhost:8080/memory/12345','DELETE')
({'transfer-encoding': 'chunked', 'date': 'Sun, 21 Sep 2008 00:18:16 GMT', 'status': '200', 'server': 'CherryPy/3.0.1'}, '')
>>> h.request('http://localhost:8080/memory/12345','GET')
({'transfer-encoding': 'chunked', 'date': 'Sun, 21 Sep 2008 00:18:20 GMT', 'status': '404', 'content-type': 'text/html', 'server': 'CherryPy/3.0.1'}, 'not found')

PUT will generate a UUID:

>>> h.request('http://localhost:8080/memory/','PUT','a new message')
({'transfer-encoding': 'chunked', 'date': 'Sun, 21 Sep 2008 00:19:40 GMT', 'status': '200', 'server': 'CherryPy/3.0.1'}, '4dc6a4ca-ebeb-41ac-81b2-5c2764c0fba8\n')
>>> h.request('http://localhost:8080/memory/4dc6a4ca-ebeb-41ac-81b2-5c2764c0fba8','GET')
({'transfer-encoding': 'chunked', 'date': 'Sun, 21 Sep 2008 00:20:09 GMT', 'status': '200', 'content-location': 'http://localhost:8080/memory/4dc6a4ca-ebeb-41ac-81b2-5c2764c0fba8', 'server': 'CherryPy/3.0.1'}, 'a new message\n')

And we can even throw binary data up into the database:

>>> f=open('exploits_of_a_mom.png','rb')
>>> h.request('http://localhost:8080/memory/johnnytables','POST',f.read())
...
>>> f.close()
http://cdn.johnpaulett.com/upload/webpy-johnnytables.png

Let's look at the list of keys:

http://cdn.johnpaulett.com/upload/webpy-datastore.png

Conclusion

As you have hopefully seen, web.py offers a very simple way to create a RESTful application. Take a look at the other implementations of the AbstractDB.
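
To give a feel for what a persistent backend could look like, here is a minimal sketch built on Python's shelve module. It is not the shelve implementation from docstore.py: `ShelveStore` is a hypothetical class that only assumes the same four storage methods (`get_key`, `put_key`, `delete_key`, `keys`), and it returns None or False for a missing key instead of producing an HTTP 404.

```python
import os
import shelve
import tempfile


class ShelveStore(object):
    """Hypothetical shelve-backed store with the same four storage methods."""

    def __init__(self, path):
        self.path = path

    def get_key(self, key):
        db = shelve.open(self.path)
        try:
            return db.get(key)  # None when the key is missing
        finally:
            db.close()

    def put_key(self, key, data):
        db = shelve.open(self.path)
        try:
            db[key] = data
        finally:
            db.close()

    def delete_key(self, key):
        db = shelve.open(self.path)
        try:
            if key in db:
                del db[key]
                return True
            return False
        finally:
            db.close()

    def keys(self):
        db = shelve.open(self.path)
        try:
            return list(db.keys())
        finally:
            db.close()


# Store the shelf in a throwaway temporary directory for this example.
store = ShelveStore(os.path.join(tempfile.mkdtemp(), 'docstore'))
store.put_key('12345', 'hello')
```

Opening and closing the shelf on every call is slow but keeps the sketch simple, and it means the data outlives the process, which the in-memory dictionary cannot offer.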

Also, check out RESTful Web Services by Leonard Richardson and Sam Ruby for a great description of building RESTful APIs.