REST defines a way to design an API whose resources you can consume using HTTP methods (GET, POST, etc.) over URLs. Interacting with such an API basically comes down to sending HTTP requests.

In this article, we’ll see which Python modules are available to solve this problem, and which one you should use. We’ll try every module on the same simple test case: creating a new Github repository through Github’s RESTful API.

Python and HTTP: too many modules

One known problem with Python is the abundance of modules enabling HTTP communication: httplib, urllib2, httplib2, pycurl and requests.

If you’re totally new to Python, this might be confusing, so let me try to clarify things a little bit. First things first: httplib will not solve your problem. Its own documentation says so:

This module defines classes which implement the client side of the
HTTP and HTTPS protocols. It is normally not used directly — the module urllib uses it to handle URLs that use HTTP and HTTPS.

So we’re down to 4 modules that could allow us to communicate with a RESTful API (between you and me, that’s still way too many, and this should be standardized).

The Github API

Reading the Github API documentation, it appears that to create a new repository, we need to send a POST request to https://api.github.com/user/repos, with some input data (the only mandatory input being name, a string) and our credentials.
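To make the shape of that request concrete, here is a minimal sketch that only describes it as data, without sending anything. The helper name describe_create_repo_request and the repository values are made up for illustration, and the sketch uses Python 3 syntax.

```python
import json

GITHUB_REPOS_URL = "https://api.github.com/user/repos"


def describe_create_repo_request(name, description=None):
    """Return (method, url, body) for a 'create repository' call.

    Hypothetical helper: it builds the pieces of the request but
    does not send it, and it leaves the credentials out.
    """
    if not name:
        raise ValueError("name is the only mandatory input")
    payload = {"name": name}
    if description is not None:
        payload["description"] = description
    return "POST", GITHUB_REPOS_URL, json.dumps(payload)


method, url, body = describe_create_repo_request("test", "Some test repo")
print(method, url)
print(body)
```

Every module below has to produce these same three pieces, plus the credentials. They only differ in how much ceremony it takes.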

Let’s see how to do such a thing with urllib2, httplib2, pycurl and requests.

urllib2

import urllib2, urllib
github_url = 'https://api.github.com/user/repos'
password_manager = urllib2.HTTPPasswordMgrWithDefaultRealm()
password_manager.add_password(None, github_url, 'user', '***')
auth = urllib2.HTTPBasicAuthHandler(password_manager)  # create an authentication handler
opener = urllib2.build_opener(auth)  # create an opener with the authentication handler
urllib2.install_opener(opener)  # install the opener...
# Manual encoding required
data = urllib.urlencode({'name': 'Test repo', 'description': 'Some test repository'})
request = urllib2.Request(github_url, data)
handler = urllib2.urlopen(request)
print handler.read()

This code is so ugly, it looks like Java. However, once we’ve instantiated the URL, the password manager and the opener, a request is only 2 lines of code. With urllib2, a request without data will be interpreted as a GET request, and a request with data will be interpreted as a POST one. The other HTTP methods (PUT, PATCH, DELETE, OPTIONS, HEAD) are not supported out of the box. Plus, the documentation is horrible… Clearly, urllib2 is not the way to go.
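The GET-or-POST guessing is easy to observe without touching the network. The sketch below assumes Python 3, where urllib2 lives on as urllib.request. The inference rule is the same there, but since Python 3.3 the Request class also accepts an explicit method argument, which the Python 2 module used above does not have.

```python
import urllib.request

url = "https://api.github.com/user/repos"

# No data: the request is treated as a GET.
get_request = urllib.request.Request(url)
print(get_request.get_method())

# With data (bytes in Python 3): the request is treated as a POST.
post_request = urllib.request.Request(url, data=b"name=test")
print(post_request.get_method())

# Python 3.3+ only: an explicit method overrides the guess.
patch_request = urllib.request.Request(url, data=b"{}", method="PATCH")
print(patch_request.get_method())
```

Nothing is sent here. The Request objects are only built and inspected.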

httplib2

The httplib2 module is a comprehensive HTTP client library that handles caching, keep-alive, compression, redirects and many kinds of authentication.

import urllib, httplib2
github_url = 'https://api.github.com/user/repos'
h = httplib2.Http(".cache")  # WAT?
h.add_credentials("user", "******", "https://api.github.com")
data = urllib.urlencode({"name": "test"})
resp, content = h.request(github_url, "POST", data)
print content

Ok, that seems better. The only weird line is h = httplib2.Http(".cache"). Why would I instantiate an Http object? Shouldn’t the Http.request method be available as a function of the httplib2 module? Seems like a design flaw to me, but that’s not a major problem anyway.

The thing is, I lied to you. Both previous examples fail miserably: the server sends back a 401.

Michael Foord explains the mechanism of Basic Authentication in urllib2 in his great blog post, urllib2 – The Missing Manual: HOWTO Fetch Internet Resources with Python:

When authentication is required, the server sends a header (as well as the 401 error code) requesting authentication. This specifies the authentication scheme and a ‘realm’. The header looks like: Www-authenticate: SCHEME realm="REALM". The client should then retry the request with the appropriate name and password for the realm included as a header in the request.

I inspected the headers Github sends back, and it appears that there is no trace of Www-authenticate in them.

'cache-control': '',
'connection': 'keep-alive',
'content-length': '37',
'content-type': 'application/json; charset=utf-8',
'date': 'Sun, 01 Jul 2012 06:29:07 GMT',
'server': 'nginx/1.0.13',
'status': '401',
'x-ratelimit-limit': '5000',
'x-ratelimit-remaining': '4999'

That’s why, with urllib2 and httplib2, all we get is an error and a 401 status code. Both modules wait for a Www-authenticate challenge before sending the credentials, and since Github never sends one, the credentials are never sent. If we wanted to use one of these modules, we’d then have to write some more code that would intercept the 401 and automatically send back the credentials. We certainly do not want to do that…
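For the curious, one possible workaround is to skip the challenge entirely and attach the Basic Authentication header ourselves on the first request. This is only a sketch under a few assumptions: it is written for Python 3 (urllib.request rather than urllib2), basic_auth_header is a made-up helper, and "user" / "pass" are placeholder credentials. The request is built but never sent.

```python
import base64
import urllib.request


def basic_auth_header(user, password):
    """Return the value of an HTTP Basic Authorization header."""
    credentials = "{}:{}".format(user, password).encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")


request = urllib.request.Request(
    "https://api.github.com/user/repos",
    data=b'{"name": "test"}',
)
# Send the credentials up front instead of waiting for a 401 challenge.
request.add_header("Authorization", basic_auth_header("user", "pass"))

print(request.get_header("Authorization"))
```

It works, but it is exactly the kind of plumbing a decent HTTP module should handle for us.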

pycurl

One other option is pycurl, a Python binding for libcurl, the C library behind cURL.

import pycurl, json
github_url = "https://api.github.com/user/repos"
user_pwd = "user:*****"
data = json.dumps({"name": "test_repo", "description": "Some test repo"})
c = pycurl.Curl()
c.setopt(pycurl.URL, github_url)
c.setopt(pycurl.USERPWD, user_pwd)
c.setopt(pycurl.POST, 1)
c.setopt(pycurl.POSTFIELDS, data)
c.perform()

This time, the repo is created and we get a 201 Created status code. Yay! However, we can see that pycurl is very low-level: it took us 4 lines to configure a simple request. These lines would either have to be repeated for each request, or be bundled into a function. We’d then have to implement some get, post… functions, which basically comes down to re-inventing the wheel.

Thus, pycurl could be an option, but it kind of defeats the purpose of Python’s high-levelness.

requests

Finally, let’s see how to create a Github repo with requests.

import requests, json
github_url = "https://api.github.com/user/repos"
data = json.dumps({'name': 'test', 'description': 'some test repo'})
r = requests.post(github_url, data, auth=('user', '*****'))
print r.json()  # json is a method in current versions of requests

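Wait, is that it? No CreateManagerWithExtraLargeName() method call? No manual credential sending? Well, no. requests was designed to be a high-level HTTP API, supporting all HTTP methods, SSL encryption, proxies, redirection, etc. With Basic Authentication, it also sends the credentials along with the first request instead of waiting for a challenge, which is why this example works where urllib2 and httplib2 got a 401.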

I find it perfect for communicating with RESTful APIs, and clearly recommend it over the previous 3 modules! Have a look at their documentation and examples if you’re still not convinced.
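If you plan to call the API more than once, a Session lets you set the credentials and common headers a single time. Here is a minimal sketch, assuming a recent version of requests (the json= argument needs 2.4.2 or later) and Python 3 syntax. The credentials and repository values are placeholders, and the request is only prepared, not sent.

```python
import json

import requests

session = requests.Session()
session.auth = ("user", "*****")  # placeholder credentials
session.headers.update({"Accept": "application/json"})

request = requests.Request(
    "POST",
    "https://api.github.com/user/repos",
    json={"name": "test", "description": "some test repo"},
)

# prepare_request merges the session's auth and headers into the request.
prepared = session.prepare_request(request)
print(prepared.method, prepared.url)
print(prepared.headers["Authorization"].split()[0])  # authentication scheme
print(json.loads(prepared.body))

# session.send(prepared) would perform the actual call.
```

Note that the Authorization header is already there before anything goes over the wire.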

Conclusion

requests is the perfect module if you want to repeatedly interact with a RESTful API. It supports pretty much everything you might need in that case, and its documentation is extremely clear. (I’m looking at you, urllib2…)

Finally, here is a well-written introduction to RESTful web services: https://www.guru99.com/restful-web-services.html