Create a docker container for your CherryPy application

In the past year, process isolation through the use of containers has exploded in popularity, and you can find containers for almost anything these days. So why not create a container to isolate your CherryPy application from the rest of the world?

I will not focus on the rights and wrongs of undertaking such a task; that is not the point of this article. Instead, this article will guide you through the steps to create a base container image that will support creating per-project images that can be run in containers.

We will be using docker for this since it’s the hottest container technology out there. That doesn’t mean it’s the best, just that it’s the most popular, which in turn means there is high demand for it. With that being said, once you have decided containers are a relevant feature to you, I encourage you to have a look at other technologies in that field to draw your own conclusions.

Docker uses various Linux kernel facilities to isolate a process from the other running processes. In particular, it uses control groups to constrain the resources used by the process. Docker also makes the most of namespaces, which create an access layer to resources such as the network, mounted devices, etc.

Basically, when you use docker, you run an instance of an image and we call this a container. An image is mostly a mille-feuille of read-only layers that are eventually unified into one. When an image is run as a container, an extra read-write layer is added by docker so that you can make changes at runtime from within your container. Those changes are lost every time you stop the running container unless you commit it into a new image.

So how do you get started with docker?

Getting started

First of all, you must install docker. I will not spend much time here explaining how to go about it since the docker documentation does it very well already. However, I encourage you to:

  • install from the docker repository, as it’s usually more up to date than official distribution repositories
  • ensure you can run docker commands as a non-root user (see below); this will make your daily usage of docker much easier
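For the second point, a common way to achieve this (assuming your distribution’s docker package created the usual docker group) is to add your user to that group and then log out and back in:

$ sudo usermod -aG docker $USER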

At the time of this writing, docker 1.4.1 is the latest version; this article was written using 1.3.3. Verify your version as follows:

$ docker version
Client version: 1.3.3
Client API version: 1.15
Go version (client): go1.3.3
Git commit (client): d344625
OS/Arch (client): linux/amd64
Server version: 1.3.3
Server API version: 1.15
Go version (server): go1.3.3
Git commit (server): d344625

Docker command interface

Docker is an application often executed as a daemon. To interact with it, you use the command line interface via the docker command. Simply run the following command to see the available subcommands:

$ docker

Play a little with docker

Before we move on to creating our docker image for a CherryPy application, let’s play with docker.

The initial step is to pull an existing image. Indeed, you will likely not create your own OS image from scratch. Instead, you will use a public base image, available on the docker public registry. During the course of these articles, we will be using an Ubuntu base image, but everything would work the same with CentOS or something else.

$ docker pull ubuntu
Pulling repository ubuntu
8eaa4ff06b53: Download complete 
511136ea3c5a: Download complete 
3b363fd9d7da: Download complete 
607c5d1cca71: Download complete 
f62feddc05dc: Download complete 
Status: Downloaded newer image for ubuntu:latest

Easy, right? The various downloads are those of the intermediary images that were generated by the Ubuntu image maintainers. Interestingly, this means you could start your own image from any of those intermediary images.
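If you are curious, the history subcommand lists the layers an image is made of; the identifiers it prints are those intermediary images:

$ docker history ubuntu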

Now that you have an image, you may wish to list all of them on your machine:

$ docker images
REPOSITORY TAG IMAGE ID CREATED VIRTUAL SIZE
ubuntu latest 8eaa4ff06b53 10 days ago 188.3 MB

Notice that the intermediate images are not listed here. To see them:

$ docker images -a

Note that, in the previous call, we didn’t specify any particular version for our docker image. You may wish to do so as follows:

$ docker pull ubuntu:14.10
Pulling repository ubuntu
bf49414948ac: Download complete 
511136ea3c5a: Download complete 
a7cca9443999: Download complete 
dbbd544a49e2: Download complete 
98b540cf0569: Download complete 
Status: Downloaded newer image for ubuntu:14.10

Let’s pull a CentOS image as well, just for fun:

$ docker pull centos:7
Pulling repository centos
8efe422e6104: Download complete 
511136ea3c5a: Download complete 
5b12ef8fd570: Download complete 
Status: Image is up to date for centos:7

Let’s now run a container and play around with it:

$ docker run --rm --name playground -i -t centos:7 bash

[root@7d5761d100e4 /]# ls
bin dev etc home lib lib64 lost+found media mnt opt proc root run sbin selinux srv sys tmp usr var

In the previous command, we start a bash command executed within a container using the CentOS image tagged 7. We name the container to make it easy to reference afterwards. This is not compulsory but is quite handy in certain situations. We also tell docker that it can dispose of the container when we exit it. Otherwise, the container will remain.

[root@7d5761d100e4 /]# uname -a
Linux 7d5761d100e4 3.13.0-43-generic #72-Ubuntu SMP Mon Dec 8 19:35:06 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

This is interesting because it shows that the container indeed runs on the host’s kernel, which in this instance belongs to my Ubuntu operating system.

Finally, let’s look at the network configuration:

[root@7d5761d100e4 /]# ip addr show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN 
 link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
 inet 127.0.0.1/8 scope host lo
 valid_lft forever preferred_lft forever
 inet6 ::1/128 scope host 
 valid_lft forever preferred_lft forever
12: eth0: <BROADCAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
 link/ether 02:42:ac:11:00:03 brd ff:ff:ff:ff:ff:ff
 inet 172.17.0.3/16 scope global eth0
 valid_lft forever preferred_lft forever
 inet6 fe80::42:acff:fe11:3/64 scope link 
 valid_lft forever preferred_lft forever

Note that the eth0 interface is attached to the bridge the docker daemon created on the host. The docker security scheme means that, by default, nothing can reach that interface from the outside. The container, however, may contact the outside world. Docker has extensive documentation regarding its networking architecture.
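For instance, from the host, you can retrieve the address docker assigned to our container with the inspect subcommand; the Go template below is one way of extracting just that field:

$ docker inspect --format '{{ .NetworkSettings.IPAddress }}' playground
172.17.0.3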

Note that you can see container statuses as follows:

$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
25454ad13219 centos:7 "bash" 4 minutes ago Up 4 minutes playground

Exit the container:

[root@7d5761d100e4 /]# exit

Run the command again:

$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES

As we can see, the container is indeed gone. Let’s now rewind a little and this time not tell docker to automatically remove the container when we exit it:

$ docker run --name playground -i -t centos:7 bash
[root@5960e4445743 /]# exit

Let’s see if the container is there:

$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES

Nope. So what’s different? Well, try to start a container using that same name again:

$ docker run --name playground -i -t centos:7 bash
2015/01/11 16:09:53 Error response from daemon: Conflict, The name playground is already assigned to 5960e4445743. You have to delete (or rename) that container to be able to assign playground to a container again.

Oops. The container is actually still there:

$ docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
5960e4445743 centos:7 "bash" About a minute ago Exited (0) 57 seconds ago

There you go. By default, docker ps doesn’t show you containers in the exited status. You have to remove the container manually using its identifier:

$ docker rm 5960e4445743
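If you have accumulated several stopped containers, you can remove them all in one go by feeding their identifiers to docker rm. This is a common pattern rather than an official cleanup command; docker will refuse to remove containers that are still running:

$ docker rm $(docker ps -a -q)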

I will not go further into docker’s basic usage, as this is all you really need to get started.

A word about tags

Technically speaking, versions do not actually exist for docker images. They are in fact tags. A tag is a simple label for an image at a given point in time.

Images are identified by a hash value. As with IP addresses, you are not expected to recall the hashes of the images you wish to use. Docker provides a tagging mechanism, much like you would use domain names instead of IP addresses.

For instance, 14.10 above is actually a tag, not a version. Obviously, since tags are meant to be meaningful to human beings, it’s quite sensible for Linux distributions to be tagged following the versions of the distributions.

You can easily create tags for any image, as shown below and as we will see again later on.
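For instance, here is how you could give the Ubuntu image we pulled earlier an additional tag of your own (the repository name below is made up for the example):

$ docker tag ubuntu:14.10 myname/ubuntu:mybase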

Let’s talk about registries

Docker images are hosted and served by a registry. As was the case in our previous examples, the registry used is often the public docker registry, available at https://registry.hub.docker.com/

When you pull an image, docker pulls from that public registry by default. However, you may pull from a different registry as follows:

$ docker pull hostname:port/path/to/image:tag

Basically, you provide the address of your registry and a path at which the image can be located. It has a form similar to a URI without the scheme.

Note that, as of docker 1.3.1, if the registry isn’t served over HTTPS, the docker client will refuse to download the image. If you need to pull anyway, you must add the following parameter to the docker daemon when it starts up:

 --insecure-registry hostname:port

Please refer to the official documentation to learn more about this.

A base Linux Python-ready container

Traditionally, deploying a CherryPy application has been done using a simple approach:

  • Package your application into an archive
  • Copy that archive onto a server
  • Configure a database server
  • Configure a reverse proxy such as nginx
  • Start the Python process(es) to serve your CherryPy application

That last operation is usually done by directly calling nohup python mymodule.py &. Alternatively, CherryPy comes with a handy script to run your application in a slightly more convenient fashion:

$ cherryd -d -c path/to/application/conf/server.conf -P path/to/application -i mymodule

This runs the Python module mymodule as a daemon using the given configuration file. If the -P flag isn’t provided, the module must be found in PYTHONPATH.

The idea is to create an image that will serve your application using cherryd. Let’s see how to set up an Ubuntu image to run your application.

$ docker run --name playground -i -t ubuntu:14.10 bash
root@d91ec7935e33:/#

First, we create a user without root permissions. This is a good habit to follow:

root@d91ec7935e33:/# useradd -m -d /home/web web
root@d91ec7935e33:/# mkdir /home/web/.venv

Next, we install a bunch of libraries that are required to deploy some common Python dependencies:

root@d91ec7935e33:/# apt-get update
root@d91ec7935e33:/# apt-get upgrade -y
root@d91ec7935e33:/# apt-get install -y libc6 libc6-dev libpython2.7-dev libpq-dev libexpat1-dev libffi-dev libssl-dev python2.7-dev python-pip
root@d91ec7935e33:/# apt-get autoclean -y
root@d91ec7935e33:/# apt-get autoremove -y

Then we create a virtual environment and install Python packages into it:

root@d91ec7935e33:/# pip install virtualenv
root@d91ec7935e33:/# virtualenv -p python2.7 /home/web/.venv/default
root@d91ec7935e33:/# source /home/web/.venv/default/bin/activate
root@d91ec7935e33:/# pip install cython
root@d91ec7935e33:/# pip install cherrypy==3.6.0 pyopenssl mako psycopg2 python-memcached sqlalchemy

These are common packages I use; obviously, install whichever ones you require.

As indicated by Tony in the comments, it is probably overkill to create a virtual environment in a container since the whole point of a container is already to isolate your process and its dependencies. I’m so used to working with virtual environments that I created one automatically. You may skip these steps.

Those operations were performed as the root user; let’s make the web user the owner of those packages:

root@d91ec7935e33:/# chown -R web.web /home/web/.venv

Good. Let’s switch to that user now:

root@d91ec7935e33:/# sudo su web
web@d91ec7935e33:/# cd /home/web

At this stage, we have a base image ready to support a CherryPy application. It might be interesting to commit that particular container as a new image so that we can reuse it in various contexts. Since docker commands talk to the docker daemon on the host, run the following from another terminal on the host while the container keeps running:

$ docker ps
CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS              PORTS                    NAMES
675ad8e8752d        ubuntu:14.10        "bash"              7 minutes ago       Up 7 minutes        0.0.0.0:9090->8080/tcp   playground
$ docker commit -m "Base Ubuntu with Python env" 675ad8e8752d
$ docker images
REPOSITORY          TAG                 IMAGE ID            CREATED             VIRTUAL SIZE
<none>              <none>              78bc8c5c2e3f        6 seconds ago       506 MB
$ docker tag 78bc8c5c2e3f lawouach/ubuntu:pythonbase
$ docker images
REPOSITORY          TAG                 IMAGE ID            CREATED             VIRTUAL SIZE
lawouach/ubuntu     pythonbase          78bc8c5c2e3f        32 seconds ago      506 MB

We commit the running container as a new image, then tag the newly created image to make it easy to reuse later on.

Let’s see if it worked. Exit the container and start a new container from the new image.

$ docker run --name playground -i -t lawouach/ubuntu:pythonbase bash 
root@66c6b2a5bb08:/# sudo su web 
web@66c6b2a5bb08:/$ cd /home/web/ 
web@66c6b2a5bb08:~$ source .venv/default/bin/activate
(default)web@66c6b2a5bb08:~$

Good. We are now ready to play.

Run a CherryPy application in a docker container

For the purpose of this article, here is our simple application:

import cherrypy

class Root(object):
    @cherrypy.expose
    def index(self):
        return "Hello world"

cherrypy.config.update({'server.socket_host': '0.0.0.0'})
cherrypy.tree.mount(Root())

Two important points:

  • You must make sure CherryPy listens on the eth0 interface, so simply make it listen on all the container’s interfaces. Otherwise, CherryPy will listen only on 127.0.0.1, which won’t be reachable from outside the container.
  • Do not start the CherryPy engine yourself; this is done by the cherryd command. You simply must ensure the application is mounted so that CherryPy can serve it.

Save this piece of code inside your container under the module name server.py. This could be any name, really. The module will be located in /home/web.

You can manually test the module:

$ docker run --name playground -p 9090:8080 -i -t lawouach/ubuntu:pythonbase bash
(default)web@66c6b2a5bb08:~$ cherryd -d -P /home/web -i server
(default)web@66c6b2a5bb08:~$ ip addr list eth0 | grep "inet "
inet 172.17.0.11/16 scope global eth0

The second command shows us the IPv4 address of this container. Now point your browser to the following URL: http://localhost:9090/

“What is this magic?” I hear you say!

If you look at the command we used to start the container, we provided this bit: -p 9090:8080. This tells docker to map port 9090 on the host to port 8080 in the container, allowing your application to be reached from the outside.
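You can verify the mapping from the host without a browser, assuming curl is installed:

$ curl http://localhost:9090/
Hello world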

And voilà!

Make the process a little more developer friendly

In the previous section, we saved the application’s code into the container itself. During development, this may not be practical. One approach is to use a volume to share a directory between your host (where you work) and the container.

$ docker run --name playground -p 9090:8080 -v `pwd`/webapp:/home/web/webapp -i -t lawouach/ubuntu:pythonbase bash

You can then work on your application from the host, and the container will see your changes immediately.

Automate things a bit

The previous steps have shown in detail how to set up an image to run a CherryPy application. Docker provides a simple interface to automate the whole process: the Dockerfile.

A Dockerfile is a simple text file containing all the steps to create an image, and more. Let’s see one first hand:

FROM ubuntu:14.10

RUN useradd -m -d /home/web web && mkdir /home/web/.venv && \
    apt-get update && apt-get upgrade -y && \
    apt-get install -y libc6 libc6-dev libpython2.7-dev libpq-dev libexpat1-dev libffi-dev libssl-dev python2.7-dev python-pip && \
    pip install virtualenv && \
    virtualenv -p python2.7 /home/web/.venv/default && \
    /home/web/.venv/default/bin/pip install cython && \
    /home/web/.venv/default/bin/pip install cherrypy==3.6.0 pyopenssl mako psycopg2 python-memcached sqlalchemy && \
    apt-get autoclean -y && \
    apt-get autoremove -y && \
    chown -R web.web /home/web/.venv

USER web
WORKDIR /home/web
ENV PYTHONPATH /home/web/webapp

COPY webapp /home/web/webapp

ENTRYPOINT ["/home/web/.venv/default/bin/cherryd", "-i", "server"]

Create a directory and save the content above into a file named Dockerfile. Create a subdirectory called webapp and store your server.py module into it.

Now, build the image as follows:

$ docker build -t lawouach/mywebapp:latest .

Use whatever tag suits you. Then, you can run a container like this:

$ docker run -p 9090:8080 -i -t lawouach/mywebapp:latest

That’s it! A docker container running your CherryPy application.

In the next articles, I will explore various options for using docker in a web application context. Follow-ups will also include an introduction to Weave and CoreOS to clusterize your CherryPy application.

In the meantime, do enjoy.

Robot Framework and Sphinx: A suitable toolset for your specification by example

At work, we have been using Robot Framework for all kinds of tests for a few years now, and it has proven to be a good choice. Robot Framework’s simple syntax and grammar does not scare testers away (usually). At the same time, its design makes it easy to support complex use cases as well as simple ones through the power of the Python programming language.

One blind spot, however, in my opinion anyway, is the way Robot Framework lets you document your tests. It provides a section for this, with basic HTML support, but it has always felt limited and not really friendly.

Luckily, in recent releases, the Robot Framework developers have provided built-in support for reStructuredText. It’s not that the documentation section supports this syntax; instead, you can embed Robot Framework tests into a reStructuredText document, and therefore into Sphinx as well.

The gain isn’t so visible in the Robot Framework reports, since the reStructuredText sections won’t appear in those, but it means you can generate HTML documents which embed executable tests. Fans of doctests will be in familiar territory.

I think this is a powerful combination as it bridges the tests with the specifications and ensures they are both kept in the same place, improving their chances of staying synchronised. In my mind, it provides a great framework for following the Specification by Example approach that Gojko Adzic described so eloquently.

Here is a simple example of what such a document could look like. This is a minimal sketch with hypothetical keywords; the specification lives in plain reStructuredText while the executable test sits in a code block that recent Robot Framework releases can run directly:
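The store must not allow a customer to check out more items than are
available in stock.

.. code:: robotframework

   *** Test Cases ***
   Checkout Fails When Out Of Stock
       Add Item To Cart    apples
       Checkout
       Error Should Be Shown    Out of stock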

Finally, a related powerful extension provides a simple mechanism to include Robot Framework tests into Sphinx documentation. We use it extensively at work as we wanted to keep our tests outside in distinct files without losing the ability to see them embedded into the generated HTML documentation.

CherryPy documentation new start

Early this year, a discussion emerged on the CherryPy mailing-list about the project. Most people said they loved the project but had struggled with its documentation. Though rich and extensive, it was felt that it let the project down somewhat by not being designed in a way that was attractive to newcomers. I took it upon myself to rewrite it from scratch, following some ideas exchanged on the mailing-list.

The generally expressed wish was to make it friendlier to people starting with the framework whilst making it easy to look up common tasks and patterns. This suited me well as I wanted to carry on the work I had started on the various recipes I keep on BitBucket.

First, I wrote a set of tutorials to guide people through the general layout of a CherryPy application. Then I developed the recipes idea further by going through many of the most recurrent questions we get on the mailing-list. Finally, I wrote an extensive section regarding the core features of the framework: plugins, tools, the bus, the dispatchers, etc. Those features are seldom used to their best even though they provide a very powerful backbone for designing your application in a clean way.

The documentation is now online and seems to have been well received. It will need to be completed, but I believe it already makes the project much more appealing and fun to work with.

Having fun with WebSocket and Canvas

Recently, I was advised that WebFaction had added support for WebSocket in their custom applications by enabling the corresponding nginx module in their frontend. Obviously, I had to try it out with my own websocket library: ws4py.

Setting up your WebFaction application

Well, as usual with WebFaction, setting up the application is dead simple. Only a few clicks from their control panel.

Create a Custom application and select websocket. This will provide you with a port that your backend will be bound to. And voilà.

Now, your application is created but you won’t yet be able to connect a websocket client. Indeed, you must associate a domain or subdomain with that application.

It is likely your application will be used from a JavaScript client living in a browser, which means you will be bound by the browser’s same-origin security model. I would therefore advise you to carefully consider your sub-domain and URL strategies. Probably something along the lines of:

  • http://yourhost/ : for the webapp
  • http://yourhost/ws : as the base url for all websocket endpoints

This is just a suggestion of course, but following a simple strategy like this one will make your deployment easier.

In the WebFaction control panel, create a website which associates your web application with the domain (your webapp can be anything you need it to be). Then associate your custom websocket application with the same domain but a different path segment. Again, by sharing the same domain, you’ll avoid having to work around the same-origin security model. I would probably advise as well that you enable SSL, but that decision is up to you.

Once done, you will now have a configured endpoint for your websocket application.

The custom websocket application will forward all requests to your port, so you could run your web and websocket apps from that single port. This is what the demo below does. I would recommend, however, that you run two different application processes, one for your webapp and another one for your websocket endpoint. Be antifragile.

Drawing stuff collaboratively

I set up a small demo (sorry, self-signed certificate) to demonstrate how you can use the HTML5 canvas and websocket features to perform collaborative tasks across various connected clients.

That demo runs a small webapp that also exposes a websocket endpoint. When you are on the drawing board, everything you draw is sent to the other connected clients so that their view reflects what’s happening on yours. Obviously this works in every direction, from any client to any other client.

The demo is implemented using ws4py hosted within a CherryPy application. Drawing events are serialized into a JSON structure and sent over to the server, which dispatches them to all participants of that board, and only that board (note, this demo doesn’t validate the content before dispatching it back, so please be conservative about whom you share your board with).
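To give an idea of the wiring, here is a minimal sketch of how such a dispatch can be set up with ws4py inside CherryPy. This is not the demo’s actual code: the handler below broadcasts to every connected client rather than to a single board, and the endpoint name is made up:

import cherrypy
from ws4py.server.cherrypyserver import WebSocketPlugin, WebSocketTool
from ws4py.websocket import WebSocket
from ws4py.messaging import TextMessage

class BoardWebSocket(WebSocket):
    def received_message(self, message):
        # Relay the JSON-encoded drawing event to all connected clients.
        cherrypy.engine.publish('websocket-broadcast', TextMessage(message.data))

class Root(object):
    @cherrypy.expose
    def ws(self):
        # The websocket tool upgrades the connection before this handler runs.
        pass

WebSocketPlugin(cherrypy.engine).subscribe()
cherrypy.tools.websocket = WebSocketTool()
cherrypy.quickstart(Root(), '/', config={
    '/ws': {'tools.websocket.on': True,
            'tools.websocket.handler_cls': BoardWebSocket}})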

Open the link found on the demo page and share it with as many browsers as you can (including your mobile device). Drawing on one device will make the drawing appear on all the other devices simultaneously and synchronously. Note that the board stays available for up to 5 minutes only and will then disconnect all participants.

The source code for the demo is located here.

Some feedback…

Let me share a bit of feedback about the whole process.

  • WebSockets are finally a reality. Unless you’re running an old browser or old mobile platform, RFC 6455 is available to you. This means you can really leverage the power of push from the server. Mind you, you might want to look at Server-Sent Events as well.
  • There isn’t yet a clear understanding of how to properly allocate your server resources. In my demo, the webapp also hosts the websocket app, but this is probably not a good idea if you have a large number of connected clients or intensive work done server side. Even though the initial connection is initiated from an HTTP request, I would suggest running the websocket server in a separate process from the HTTP server.
  • Security wise, I would suggest you follow the usual principle of validating any data coming through before you process or dispatch it back.
  • WebFaction’s support for websocket is dead easy to set up and fast (at least, since my demo is hosted in Europe and I live in France, I can hardly see any delay). I would consider its performance good enough to support some really funky real-time applications.
  • jCanvas is really useful to unify your canvas operations.  For this demo, it’s been a blessing.
  • Device motion events are low level and you need to do a lot of leg work to actually make sense of them. This demo is probably not making a really good use of them.
  • There seems to be no universal way to detect that you are running on a mobile device. Go figure.

Next, I wouldn’t mind adding websocket to that fun demo from the mozilla developer network.

The joy of distributing Python packages for Python 2 and 3

I released ws4py 0.3.4 this weekend, and although I had integrated support for Python 2 and 3 a long time ago, I ran into a challenge I had quite missed. Indeed, until now my Python 3 support had mainly been concerned with string handling and various compatibility modules. This had proven to work very well and avoided having to rely on external packages such as six.

However, in the past few weeks I added asyncio support to ws4py and therefore introduced the new yield from statement. Of course, this isn’t tolerated by Python 2, which complains with a well-deserved SyntaxError.

The issue, however, is that I wished to distribute the same source code in a single source distribution archive. Initially, I had written a function that prevented modules using that statement from being packaged. However, this was rather daft since, if they weren’t packaged, they wouldn’t be distributed either. Next, I tried some setuptools magic. Well, it didn’t help, since that’s not what it’s there for anyhow. At this stage, I should say: don’t simply copy/paste. It will do no good.

Finally, I opted for a fairly simple solution. I knew that when a package is installed from a source distribution, it is obviously built first. I therefore had to act after the Python modules had been gathered but before they were built. After briefly browsing through the distutils source code, I found where I would perform surgery: the find_package_modules method of the distutils.command.build_py.build_py class. The nice aspect of this solution is that the source distribution indeed contains all the modules, whether they target Python 2 or 3, but it’s only at installation time that the appropriate modules are selected and built.
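To illustrate, here is a minimal sketch of the mechanism; the filtering rule below (dropping modules whose name ends with _asyncio on Python 2) is a hypothetical example, not ws4py’s actual criterion:

import sys
from distutils.command.build_py import build_py

class build_py_filtered(build_py):
    def find_package_modules(self, package, package_dir):
        # Let distutils gather every module from the source distribution...
        modules = build_py.find_package_modules(self, package, package_dir)
        if sys.version_info < (3, 3):
            # ... then, on Python 2, drop the modules that rely on
            # "yield from" so they are never built nor byte-compiled.
            modules = [m for m in modules if not m[1].endswith('_asyncio')]
        return modules

# Hook it into setup():
# setup(..., cmdclass={'build_py': build_py_filtered})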

Am I doing it wrong? Is there a cleaner, nicer, more pythonic way? If so, please let me know. If not, I hope this may help others who want a simple solution for handling their Python 2 and 3 modules in a single baseline.

“Robot Framework Test Automation” book review

From time to time, PacktPub requests a book review of one of their Python-related titles. This time around it was for the “Robot Framework Test Automation” book they recently released. Since I’ve been using this awesome acceptance testing tool at work for more than two years, I was happy to comply.

In a nutshell, Robot Framework provides a great interface that acts as the middle-man between the various stakeholders. Indeed, tests are written in plain text (though other formats are supported, I never use them) with a rather minimal set of rules, making them (almost) straightforward to read even for non-technical people. The dirty technical details are hidden away, implemented in Python and executable on one of the various Python VMs (CPython, Jython and IronPython are supported out of the box).

Most of the time, the basics of the Robot Framework data model and workflow can be taught in a couple of hours. However, being efficient with it takes a little more time. Still, people don’t have to learn a complete programming language (Python), and that’s a relief, meaning they are happy to work with Robot Framework’s sometimes cumbersome syntax.

In spite of rather extensive documentation being available online, the project lacked a good, straight-to-the-point summary that takes you by the hand. Moreover, the project’s documentation style is fairly dry and Unix-like, making it tedious to browse at times. Still, the content is there and it has rarely failed me. With that said, having a friendly book on the subject is a great thing. Kudos to PacktPub. Now, about the book…

The good

The book provides an introduction to the tool, its most common usages, and even tries to guide you towards getting more from it. It’s a short book, 83 pages, that will not bore you with complex details. In other words, it’s a good companion to the online documentation if you are starting with Robot Framework.

Sumit Bisht, the author, does a good job of keeping a neutral point of view regarding how you should use Robot Framework. Indeed, depending on your software under test, you might want a more data-oriented approach (à la FitNesse), a behaviour-driven testing approach or even a more assert-oriented style. Not many tools can deal with all of them equally, and it also depends on how testing is perceived in your organisation. Robot Framework can cope with all of them.

The bad

Though I understand it’s only an introduction, it feels like some concepts are not properly explored: the idea behind keywords, the internal data model, dynamic libraries, etc. In other words, you will not really understand the underlying blocks and axioms that form the pedestal of the whole tool; rather, you’ll learn the basics of using it. In fact, the only section where the book goes into more technical detail (with a good example on using Sikuli) will probably confuse you, since it fails to properly introduce the principles behind it.

The ugly

There isn’t anything particularly bad about this book; again, it should be considered a friendly introduction. I do not agree with a few minor points Sumit makes, but they hardly matter and aren’t wrong anyway, just a matter of opinion. Note also that the book lacks examples in a couple of places where they would have mattered, but I don’t believe this makes the book any less useful.

The only thing that really annoys me is that the layout of PacktPub’s books still looks so unprofessional. They should really make an effort, as the code is, most of the time, too hard to read (actually, on this one item, it wasn’t that bad).

Final note

I think this book is ideal if you are about to start with Robot Framework, as it will speed up learning the basics. If you’re already used to the tool, I am not sure it will help very much.


ws4py – WebSocket client and server library for Python

Recently I released ws4py, a package that provides client and server WebSocket support for Python 2.6 and 2.7.

Let’s first have a quick overview of what ws4py offers for now:

  • Support for draft-10 of the WebSocket specification.
  • A threaded client. This gives a simple client that doesn’t require an external dependency.
  • A Tornado client. This client is based on Tornado 2.0 which is quite a popular way of running asynchronous networking code these days. Tornado provides its own server implementation so I didn’t include mine in ws4py.
  • A CherryPy extension so that you can integrate WebSocket from within your CherryPy 3.2.1 server.
  • A gevent server based on the popular gevent library. This is courtesy of Jeff Lindsay.
  • Based on Jeff’s work, a pure WSGI middleware as well (available in the current master branch only until the next release).
  • ws4py runs on Android devices thanks to the SL4A package

Hopefully more clients and servers will be added along the way, as well as Python 3.x support. The former should be rather simple to add due to the way I designed ws4py.

The main idea is to make a distinction between the bytes provider and the bytes processing. The former is essentially reading and writing bytes from the connected socket. The latter is the job of making something out of the received bytes, based on the WebSocket specification. In most implementations I have seen so far, the two are rather heavily intertwined, making it difficult to swap in a different bytes provider.

ws4py tries a different path by relying on a great feature of Python: the possibility to send data back to a generator. For instance, the frame parsing yields the quantity of bytes each time it needs more and the caller feeds back the generator those bytes once they are received. In fact, the caller of a frame parser is a stream object which acts the same way. The caller of that stream object is in fact the bytes provider (a client or a server). The stream is in charge of aggregating frames into a WebSocket message. Thanks to that design, both the frame and stream objects are totally unaware of the bytes provider and can be easily adapted in various contexts (gevent, tornado, CherryPy, etc.).
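Here is a toy illustration of that pattern, much simplified compared to ws4py’s actual parser (no masking, no extended lengths): the generator yields how many bytes it needs next, and the caller feeds those bytes back in with send().

import struct

def frame_parser():
    # Ask the caller for the two header bytes of a frame.
    header = yield 2
    first_byte, length_byte = struct.unpack('!BB', header)
    payload_length = length_byte & 0x7F
    # Ask for the payload itself, then hand it back once received.
    payload = yield payload_length
    yield payload

parser = frame_parser()
needed = next(parser)               # the parser asks for 2 bytes
needed = parser.send(b'\x81\x05')   # it now asks for 5 more bytes
print(parser.send(b'hello'))        # the complete payload: b'hello'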

On my TODO list for ws4py:

  • Upgrade to a more recent version of the specification
  • Python 3.x implementation
  • Better documentation (read: write documentation).
  • Better performances on very large WebSocket messages

Acceptance testing a CherryPy application with Robot Framework

I recently received the Python Testing Cookbook authored by Greg L. Turnquist and was happy to read about recipes on acceptance testing using Robot Framework. We’ve been using this tool at work for a few weeks now with great results. Greg shows how to test a web application using the Selenium Library extension for Robot Framework and I thought it’d be fun to demonstrate how to test a CherryPy application following his recipe. So here we go.

First some requirements:

$ mkvirtualenv --distribute --no-site-packages --unzip-setuptools acceptance
(acceptance)$ pip install cherrypy
(acceptance)$ pip install robotframework
(acceptance)$ pip install robotframework-seleniumlibrary

Let’s define a simple CherryPy application which displays an input field where you can type a message. When the submit button is pressed, the message is sent to the server and returned as-is. Well, it’s an echo service really.

import cherrypy
 
__all__ = ['Echo']
 
class Echo(object):
    @cherrypy.expose
    def index(self):
        return """<html>
<head><title>Robot Framework Test for CherryPy</title></head>
<body>
<form method="post" action="/echo">
<input type="text" name="message" />
<input type="submit" />
</form>
</body>
</html>"""
 
    @cherrypy.expose
    def echo(self, message):
        return message
 
if __name__ == '__main__':
    cherrypy.quickstart(Echo())

Save the code above in a module named myapp.py.

Next, we create an extension to Robot Framework that will manage CherryPy. Save the following in a module named CherryPyLib.py. It’s important to respect that name, since Robot Framework expects the module and its class to share the same name.

import imp
import os, os.path
 
import cherrypy
 
class CherryPyLib(object):
    def setup_cherrypy(self, conf_file=None):
        """
        Configures the CherryPy engine and server using
        the built-in 'embedded' environment mode.
 
        If provided, `conf_file` is a path to a CherryPy
        configuration file used in addition.
        """
        cherrypy.config.update({"environment": "embedded"})
        if conf_file:
            cherrypy.config.update(conf_file)            
 
    def start_cherrypy(self):
        """
        Starts a CherryPy engine.
        """
        cherrypy.engine.start()
 
    def exit_cherrypy(self):
        """
        Terminates a CherryPy engine.
        """
        cherrypy.engine.exit()
 
    def mount_application(self, appmod, appcls, directory=None):
        """
        Mounts an application to be tested. `appmod` is the name
        of a Python module containing `appcls`. The module is
        looked for in the given directory. If not provided, we use
        the current one instead.
        """
        directory = directory or os.getcwd()
        file, filename, description = imp.find_module(appmod, [directory])
        mod = imp.load_module(appmod, file, filename, description)
        if hasattr(mod, appcls):
            cls = getattr(mod, appcls)
            app = cls()
            cherrypy.tree.mount(app)
        else:
            raise ImportError, "cannot import name %s from %s" % (appcls, appmod)

Note that we start and stop the CherryPy server during the test itself, meaning you don’t need to start it separately. Pure awesomeness.

Finally let’s write a straightforward acceptance test to validate the overall workflow of echoing a message using our little application.

***Settings***
Library	SeleniumLibrary
Library	CherryPyLib
Suite Setup	Start Dependencies
Suite Teardown	Shutdown Dependencies
Test Setup	Mount Application	myapp	Echo

***Variables***
${MSG}	Hello World
${HOST}	http://localhost:8080/

***Test Cases***
Echo ${MSG}
     Open Browser	${HOST}
     Input text		message		${MSG}
     Submit form
     Page Should Contain		${MSG}
     Close All Browsers

***Keywords***
Start Dependencies
    Setup Cherrypy
    Start CherryPy
    Start Selenium Server
    Sleep 	3s

Shutdown Dependencies
    Stop Selenium Server
    Exit CherryPy

Save the test above into a file named testmyapp.txt. You can finally run the test as follows:

(acceptance)$ pybot --pythonpath . testmyapp.txt

This will start CherryPy, Selenium’s proxy server and Firefox within which the test case will be run. Easy, elegant and powerful.

Hosting a Django application on a CherryPy server

Recently at work, I had a requirement to host a Django application in a CherryPy server. I first looked at various projects I knew were doing just that. Unfortunately, after trying them, I was rather disappointed. Their approach is to provide a command similar to the famous Django runserver, but I found them more complex than necessary. So I wrote my own module, which performs those operations while staying much closer to how CherryPy works, most specifically by using the process bus that comes with CherryPy.

I’m sharing a stripped down version of the module I wrote which shows how one could host a Django application in a CherryPy server. Hopefully this might help some of you.

# Python stdlib imports
import sys
import logging
import os, os.path
 
# Third-party imports
import cherrypy
from cherrypy.process import wspbus, plugins
from cherrypy import _cplogging, _cperror
from django.conf import settings
from django.core.handlers.wsgi import WSGIHandler
from django.http import HttpResponseServerError
 
class Server(object):
    def __init__(self):
        self.base_dir = os.path.join(os.path.abspath(os.getcwd()), "cpdjango")
 
        conf_path = os.path.join(self.base_dir, "..", "server.cfg")
        cherrypy.config.update(conf_path)
 
        # This registers a plugin to handle the Django app
        # with the CherryPy engine, meaning the app will
        # play nicely with the process bus that is the engine.
        DjangoAppPlugin(cherrypy.engine, self.base_dir).subscribe()
 
    def run(self):
        engine = cherrypy.engine
        engine.signal_handler.subscribe()
 
        if hasattr(engine, "console_control_handler"):
            engine.console_control_handler.subscribe()
 
        engine.start()
        engine.block()
 
class DjangoAppPlugin(plugins.SimplePlugin):
    def __init__(self, bus, base_dir):
        """
        CherryPy engine plugin to configure and mount
        the Django application onto the CherryPy server.
        """
        plugins.SimplePlugin.__init__(self, bus)
        self.base_dir = base_dir
 
    def start(self):
        self.bus.log("Configuring the Django application")
 
        # Well this isn't quite as clean as I'd like so
        # feel free to suggest something more appropriate
        from cpdjango.settings import *
        app_settings = locals().copy()
        del app_settings['self']
        settings.configure(**app_settings)
 
        self.bus.log("Mounting the Django application")
        cherrypy.tree.graft(HTTPLogger(WSGIHandler()))
 
        self.bus.log("Setting up the static directory to be served")
        # We server static files through CherryPy directly
        # bypassing entirely Django
        static_handler = cherrypy.tools.staticdir.handler(section="/", dir="static",
                                                          root=self.base_dir)
        cherrypy.tree.mount(static_handler, '/static')
 
class HTTPLogger(_cplogging.LogManager):
    def __init__(self, app):
        _cplogging.LogManager.__init__(self, id(self), cherrypy.log.logger_root)
        self.app = app
 
    def __call__(self, environ, start_response):
        """
        Called as part of the WSGI stack to log the incoming request
        and its response using the common log format. If an error bubbles up
        to this middleware, we log it as such.
        """
        try:
            response = self.app(environ, start_response)
            self.access(environ, response)
            return response
        except:
            self.error(traceback=True)
            return HttpResponseServerError(_cperror.format_exc())
 
    def access(self, environ, response):
        """
        Special method that logs a request following the common
        log format. This is mostly taken from CherryPy and adapted
        to the WSGI's style of passing information.
        """
        atoms = {'h': environ.get('REMOTE_ADDR', ''),
                 'l': '-',
                 'u': "-",
                 't': self.time(),
                 'r': "%s %s %s" % (environ['REQUEST_METHOD'], environ['REQUEST_URI'], environ['SERVER_PROTOCOL']),
                 's': response.status_code,
                 'b': str(len(response.content)),
                 'f': environ.get('HTTP_REFERER', ''),
                 'a': environ.get('HTTP_USER_AGENT', ''),
                 }
        for k, v in atoms.items():
            if isinstance(v, unicode):
                v = v.encode('utf8')
            elif not isinstance(v, str):
                v = str(v)
            # Fortunately, repr(str) escapes unprintable chars, \n, \t, etc
            # and backslash for us. All we have to do is strip the quotes.
            v = repr(v)[1:-1]
            # Escape double-quote.
            atoms[k] = v.replace('"', '\\"')
 
        try:
            self.access_log.log(logging.INFO, self.access_log_format % atoms)
        except:
            self.error(traceback=True)
 
if __name__ == '__main__':
    Server().run()

You can find the code, alongside a minimal Django application showing how this works, here (BSD licence). I used Django 1.3 to generate a default project, but the code above works well with older versions of Django.

Edit 16/03/2012: Thanks to Damien Tougas, I’ve wrapped up a better recipe for hosting a Django application into a CherryPy application server.

Should we assess OpenData innovation and impacts?

Yesterday, I attended a talk titled “OpenData, a basis for new political technologies?” at la Cantine Numérique Rennaise [fr], a place about the digital age located in Rennes. During the debate, I asked how we could assess the impacts of OpenData without some sort of measuring instrument. This is a question the EU asked itself in a recent report.

Xavier Crouan, who has been the director of digital innovation and information in Rennes for the past few years and has communicated extensively on OpenData, made a comment that I felt misunderstood my question. He roughly stated that it felt typically French to ask for tools and indicators whenever risks were taken and innovation pursued. He found it saddening to hear French engineers being so grounded and felt innovation should not have to justify itself.

Honestly, that wasn’t what I was driving at. The discussion at that point of the debate was about how OpenData would eventually make a difference in people’s lives, politically as much as economically. In that context, it seemed sensible to ask how we could measure the impacts of OpenData so that we could tweak, tune and improve its usage.

Now, in regards to innovation itself, I believe you usually need simple indicators to gauge whether or not you’re walking a fruitful path.

For instance, Rennes held a contest for building applications on data it recently opened, and Xavier Crouan indicated that 2000 people voted. One might consider that an indicator of whether or not the contest was publicly a success and, if not, of how to tune it should there be another contest next year.

Shooting in different directions in the hope that one of them will lead to strong innovation is shortsighted in my book. You need to define a few criteria to assess how each direction fares. This is what OpenData promotes too: improving efficiency in the reuse of public sector data.

Innovation is not incompatible with retrospective.