Tag Archives: python

Deploying a docker container of a CherryPy application onto a CoreOS cluster

Previously, I presented a simple web application that was distributed into several docker containers. In this article, I will be introducing the CoreOS platform as the backend for clusterizing a CherryPy application.

CoreOS quick overview

CoreOS is a Linux distribution designed to support distributed/clustering scenarios. I will not spend too much time explaining it here as their documentation already provides lots of information. More specifically, review their architecture use-cases for a good overview of how CoreOS is articulated.

What matters to us in this article is that we can use CoreOS to manage a cluster of nodes that will host our application as docker containers. To achieve this, CoreOS relies on technologies such as systemd, etcd and fleet at its core.

Each CoreOS instance within the cluster runs a Linux kernel which executes systemd to manage processes within that instance. etcd is a distributed key/value store used across the cluster to enable service discovery and configuration synchronization. Fleet is used to manage services executed within your cluster. Those services are described in files called unit files.

Roughly speaking, you use a unit file to describe your service and specify which docker container to execute. Using fleet, you submit and load that service onto the cluster before starting/stopping it at will. CoreOS determines which host it will deploy it on (you can set up constraints that CoreOS will follow). Once loaded onto a node, the node’s systemd takes over to manage the service locally and you can use fleet to query the status of that service from outside.

Setup your environment with Vagrant

Vagrant is a nifty tool to orchestrate small deployments on your development machine. For instance, here is a simple command to create a node with Ubuntu running on it:

$ vagrant up ubuntu/trusty64 --provider virtualbox

Vagrant has a fairly rich command line you can script to generate a final image. However, Vagrant usually provisions virtual machines by following a description found within a simple text file (well, actually a Ruby module) called a Vagrantfile. This is the path we will be following in this article.

Let’s get the code:

$ hg clone https://bitbucket.org/Lawouach/cherrypy-recipes
$ cd cherrypy-recipes/deployment/container/vagrant_webapp_with_load_balancing

From there you can create the cluster as follows:

$ eval `ssh-agent -s`
$ export FLEETCTL_TUNNEL=127.0.0.1:2222
$ ./cluster create

I am not using Vagrant directly to create the cluster because a couple of other operations must be carried out to let fleet talk to the CoreOS node properly (see the sketch after this list). Namely:

  • Generate a new cluster id (via https://discovery.etcd.io/new)
  • Start a ssh agent to handle the node’s SSH identities to connect from the outside
  • Indicate where to locate the node’s ssh service (through a port mapped by Vagrant)
  • Create the cluster (this calls vagrant up internally)
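
For reference, here is roughly what the script automates (a simplified sketch; the actual script lives in the repository and may differ):

$ curl -s https://discovery.etcd.io/new          # fetch a fresh discovery token for the cluster
$ ssh-add ~/.vagrant.d/insecure_private_key      # let the SSH agent serve Vagrant's key to fleetctl
$ vagrant up                                     # boot the CoreOS node(s)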

Once completed, you should have a running CoreOS node that you can log into:

$ vagrant ssh core-01

To destroy the cluster and terminate the node:

$ ./cluster destroy

This also takes care of wiping out local resources that we don’t need any longer.

Before moving on, you will need to install the fleet tools.

$ wget https://github.com/coreos/fleet/releases/download/v0.9.0/fleet-v0.9.0-linux-amd64.tar.gz
$ tar zxvf fleet-v0.9.0-linux-amd64.tar.gz
$ export PATH=$PATH:`pwd`/fleet-v0.9.0-linux-amd64

Run your CherryPy application onto the cluster

If you have destroyed the cluster, re-create it and make sure you can speak to it through fleet as follows:

$ fleetctl list-machines
MACHINE IP METADATA
50f6819c... 172.17.8.101 -

Bingo! This is the public address we statically set in the Vagrantfile associated with the node.

Let’s ensure we have no registered units yet:

$ fleetctl list-unit-files
UNIT HASH DSTATE STATE TARGET

$ fleetctl list-units
UNIT MACHINE ACTIVE SUB

Okay, all is good. Now, let’s push each of our units to the cluster:

$ fleetctl submit units/webapp_db.service
$ fleetctl submit units/webapp_app@.service 
$ fleetctl submit units/webapp_load_balancer.service 

$ fleetctl list-unit-files
UNIT                            HASH    DSTATE          STATE           TARGET
webapp_app@.service             02c0c64 inactive        inactive        -
webapp_db.service               127e44a inactive        inactive        -
webapp_load_balancer.service    e1cfee6 inactive        inactive        -

$ fleetctl list-units
UNIT    MACHINE  ACTIVE  SUB

As you can see, the unit files have been registered but they are not loaded onto the cluster yet.

Notice the naming convention used for webapp_app@.service: this file will not be considered a service description itself but a template for a named service. We will see this in a minute. Refer to this extensive DigitalOcean article for more details regarding unit files.
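
To give you an idea, here is a rough sketch of what the webapp_app@.service template contains, reconstructed from the fleetctl status output shown later in this article (the real unit files live in the repository and may differ slightly). The %i placeholder is replaced by the identifier you pass when instantiating the template:

[Unit]
Description=App service
After=docker.service
Requires=docker.service

[Service]
# The leading '-' lets the cleanup commands fail harmlessly when no previous container exists.
ExecStartPre=-/usr/bin/docker kill notes%i
ExecStartPre=-/usr/bin/docker rm notes%i
ExecStartPre=/usr/bin/docker pull lawouach/webapp_app
ExecStart=/usr/bin/docker run --link notesdb:postgres --name notes%i -P -t lawouach/webapp_app:latest
ExecStop=/usr/bin/docker stop notes%i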

Let’s now load each unit onto the cluster:

$ fleetctl load units/webapp_db.service
Unit webapp_db.service loaded on 50f6819c.../172.17.8.101

$ fleetctl list-units
UNIT              MACHINE                  ACTIVE   SUB
webapp_db.service 50f6819c.../172.17.8.101 inactive dead

Here, we asked fleet to load the service onto an available node. Considering there is a single node, it wasn’t a difficult decision to make.

At that stage, your service is not started. It simply is attached to a node.

$ fleetctl journal webapp_db.service
-- Logs begin at Tue 2015-02-17 19:26:07 UTC, end at Tue 2015-02-17 19:40:49 UTC. --

It is not compulsory to explicitly load a service before starting it. However, it gives you the opportunity to unload the service if a specific condition occurs (the service needs to be amended, the chosen host isn’t valid any longer…).

Now we can finally start it:

$ fleetctl start units/webapp_db.service 
Unit webapp_db.service launched on 50f6819c.../172.17.8.101

You can see what’s happening:

$ fleetctl journal webapp_db.service
-- Logs begin at Tue 2015-02-17 19:26:07 UTC, end at Tue 2015-02-17 19:56:28 UTC. --
Feb 17 19:56:19 core-01 docker[1561]: dc55e5f30ff9: Pulling fs layer
Feb 17 19:56:21 core-01 docker[1561]: dc55e5f30ff9: Download complete
Feb 17 19:56:21 core-01 docker[1561]: 835f524d1d7e: Pulling metadata
Feb 17 19:56:22 core-01 docker[1561]: 835f524d1d7e: Pulling fs layer
Feb 17 19:56:24 core-01 docker[1561]: 835f524d1d7e: Download complete
Feb 17 19:56:24 core-01 docker[1561]: cb0503cedddb: Pulling metadata
Feb 17 19:56:25 core-01 docker[1561]: cb0503cedddb: Pulling fs layer
Feb 17 19:56:27 core-01 docker[1561]: cb0503cedddb: Download complete
Feb 17 19:56:27 core-01 docker[1561]: cdd30fd0c6f3: Pulling metadata
Feb 17 19:56:27 core-01 docker[1561]: cdd30fd0c6f3: Pulling fs layer

Or alternatively, you can request the service’s status:

$ fleetctl status units/webapp_db.service 
● webapp_db.service - Notes database
   Loaded: loaded (/run/fleet/units/webapp_db.service; linked-runtime; vendor preset: disabled)
   Active: activating (start-pre) since Tue 2015-02-17 19:55:33 UTC; 1min 25s ago
  Process: 1552 ExecStartPre=/usr/bin/docker rm notesdb (code=exited, status=1/FAILURE)
  Process: 1478 ExecStartPre=/usr/bin/docker kill notesdb (code=exited, status=1/FAILURE)
  Control: 1561 (docker)
   CGroup: /system.slice/webapp_db.service
           └─control
             └─1561 /usr/bin/docker pull lawouach/webapp_db

Feb 17 19:56:31 core-01 docker[1561]: c1eac5e31754: Pulling fs layer
Feb 17 19:56:33 core-01 docker[1561]: c1eac5e31754: Download complete
Feb 17 19:56:33 core-01 docker[1561]: 672ef5050bb9: Pulling metadata
Feb 17 19:56:35 core-01 docker[1561]: 672ef5050bb9: Pulling fs layer
Feb 17 19:56:36 core-01 docker[1561]: 672ef5050bb9: Download complete
Feb 17 19:56:36 core-01 docker[1561]: 7ebc912be04a: Pulling metadata
Feb 17 19:56:37 core-01 docker[1561]: 7ebc912be04a: Pulling fs layer
Feb 17 19:56:52 core-01 docker[1561]: 7ebc912be04a: Download complete
Feb 17 19:56:52 core-01 docker[1561]: 22f2bfe64e7f: Pulling metadata
Feb 17 19:56:52 core-01 docker[1561]: 22f2bfe64e7f: Pulling fs layer

Once the service is ready:

fleetctl status units/webapp_db.service 
● webapp_db.service - Notes database
   Loaded: loaded (/run/fleet/units/webapp_db.service; linked-runtime; vendor preset: disabled)
   Active: active (running) since Tue 2015-02-17 19:57:24 UTC; 2min 46s ago
  Process: 1561 ExecStartPre=/usr/bin/docker pull lawouach/webapp_db (code=exited, status=0/SUCCESS)
  Process: 1552 ExecStartPre=/usr/bin/docker rm notesdb (code=exited, status=1/FAILURE)
  Process: 1478 ExecStartPre=/usr/bin/docker kill notesdb (code=exited, status=1/FAILURE)
 Main PID: 1831 (docker)
   CGroup: /system.slice/webapp_db.service
           └─1831 /usr/bin/docker run --name notesdb -e POSTGRES_PASSWORD=test -e POSTGRES_USER=test -t lawouach/webapp_db:latest

Feb 17 19:57:28 core-01 docker[1831]: backend>
Feb 17 19:57:28 core-01 docker[1831]: PostgreSQL stand-alone backend 9.4.0
Feb 17 19:57:28 core-01 docker[1831]: backend> statement: CREATE USER "test" WITH SUPERUSER PASSWORD 'test' ;
Feb 17 19:57:28 core-01 docker[1831]: backend>
Feb 17 19:57:28 core-01 docker[1831]: ******CREATING NOTES DATABASE******
Feb 17 19:57:28 core-01 docker[1831]: PostgreSQL stand-alone backend 9.4.0
Feb 17 19:57:28 core-01 docker[1831]: backend> backend> backend> ******DOCKER NOTES CREATED******
Feb 17 19:57:28 core-01 docker[1831]: LOG:  database system was shut down at 2015-02-17 19:57:28 UTC
Feb 17 19:57:28 core-01 docker[1831]: LOG:  database system is ready to accept connections
Feb 17 19:57:28 core-01 docker[1831]: LOG:  autovacuum launcher started

Starting a service from a unit template works the same way except you provide an identifier to the instance:

$ fleetctl load units/webapp_app@1.service
$ fleetctl start units/webapp_app@1.service
$ fleetctl status units/webapp_app@1.service 
● webapp_app@1.service - App service
   Loaded: loaded (/run/fleet/units/webapp_app@1.service; linked-runtime; vendor preset: disabled)
   Active: active (running) since Tue 2015-02-17 20:06:40 UTC; 2min 56s ago
  Process: 2031 ExecStartPre=/usr/bin/docker pull lawouach/webapp_app (code=exited, status=0/SUCCESS)
  Process: 2019 ExecStartPre=/usr/bin/docker rm notes%i (code=exited, status=1/FAILURE)
  Process: 2012 ExecStartPre=/usr/bin/docker kill notes%i (code=exited, status=1/FAILURE)
 Main PID: 2170 (docker)
   CGroup: /system.slice/system-webapp_app.slice/webapp_app@1.service
           └─2170 /usr/bin/docker run --link notesdb:postgres --name notes1 -P -t lawouach/webapp_app:latest

Feb 17 20:06:41 core-01 docker[2170]: [17/Feb/2015:20:06:41] ENGINE Listening for SIGHUP.
Feb 17 20:06:41 core-01 docker[2170]: [17/Feb/2015:20:06:41] ENGINE Listening for SIGTERM.
Feb 17 20:06:41 core-01 docker[2170]: [17/Feb/2015:20:06:41] ENGINE Listening for SIGUSR1.
Feb 17 20:06:41 core-01 docker[2170]: [17/Feb/2015:20:06:41] ENGINE Bus STARTING
Feb 17 20:06:41 core-01 docker[2170]: [17/Feb/2015:20:06:41] ENGINE Starting up DB access
Feb 17 20:06:41 core-01 docker[2170]: [17/Feb/2015:20:06:41] ENGINE Setting up Mako resources
Feb 17 20:06:41 core-01 docker[2170]: [17/Feb/2015:20:06:41] ENGINE Started monitor thread 'Autoreloader'.
Feb 17 20:06:41 core-01 docker[2170]: [17/Feb/2015:20:06:41] ENGINE Started monitor thread '_TimeoutMonitor'.
Feb 17 20:06:42 core-01 docker[2170]: [17/Feb/2015:20:06:42] ENGINE Serving on http://0.0.0.0:8080
Feb 17 20:06:42 core-01 docker[2170]: [17/Feb/2015:20:06:42] ENGINE Bus STARTED

The reason I chose 1 as the identifier is so that the container’s name becomes notes1, which is what the load-balancer container expects when linking to the application’s container, as described in the previous article.

Start a second instance of that unit template:

$ fleetctl load units/webapp_app@2.service
$ fleetctl start units/webapp_app@2.service

That second instance starts immediately because the image is already there.

Finally, once both services are marked as “active”, you can start the load-balancer service as well:

$ fleetctl start units/webapp_load_balancer.service 
$ fleetctl status units/webapp_load_balancer.service 
● webapp_load_balancer.service - Load Balancer service
   Loaded: loaded (/run/fleet/units/webapp_load_balancer.service; linked-runtime; vendor preset: disabled)
   Active: active (running) since Tue 2015-02-17 20:10:21 UTC; 1min 51s ago
  Process: 2418 ExecStartPre=/usr/bin/docker pull lawouach/webapp_load_balancer (code=exited, status=0/SUCCESS)
  Process: 2410 ExecStartPre=/usr/bin/docker rm notes_loadbalancer (code=exited, status=1/FAILURE)
  Process: 2403 ExecStartPre=/usr/bin/docker kill notes_loadbalancer (code=exited, status=1/FAILURE)
 Main PID: 2500 (docker)
   CGroup: /system.slice/webapp_load_balancer.service
           └─2500 /usr/bin/docker run --link notes1:n1 --link notes2:n2 --name notes_loadbalancer -p 8090:8090 -p 8091:8091 -t lawouach/webapp_load_balancer:latest

Feb 17 20:10:14 core-01 docker[2418]: 9284a1282362: Download complete
Feb 17 20:10:14 core-01 docker[2418]: d53024a13d34: Pulling metadata
Feb 17 20:10:15 core-01 docker[2418]: d53024a13d34: Pulling fs layer
Feb 17 20:10:17 core-01 docker[2418]: d53024a13d34: Download complete
Feb 17 20:10:17 core-01 docker[2418]: 45e1cf959053: Pulling metadata
Feb 17 20:10:18 core-01 docker[2418]: 45e1cf959053: Pulling fs layer
Feb 17 20:10:21 core-01 docker[2418]: 45e1cf959053: Download complete
Feb 17 20:10:21 core-01 docker[2418]: 45e1cf959053: Download complete
Feb 17 20:10:21 core-01 docker[2418]: Status: Downloaded newer image for lawouach/webapp_load_balancer:latest
Feb 17 20:10:21 core-01 systemd[1]: Started Load Balancer service.

At that stage, the complete application is up and running and you can go to http://localhost:7070/ to use it. Port 7070 is mapped to port 8091 by vagrant within our Vagrantfile.
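
That mapping comes from a forwarded-port declaration in the Vagrantfile, something along these lines (paraphrased, not the literal file):

config.vm.network "forwarded_port", guest: 8091, host: 7070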

No such thing as a free lunch

As I said earlier, we created a cluster of one node on purpose. Indeed, the way all our containers are able to dynamically locate each other is through the linking mechanism. Though this works very well in simple scenarios like this one, it has a fundamental limit: you cannot link across different hosts. If we had multiple nodes, fleet would try distributing our services across all of them (unless we decided to constrain this within the unit files), which would obviously break the links between them. This is why, in this particular example, we create a single-node cluster.

Docker documents a pattern named ambassador to address this restriction, but we will not review it here. Instead, we will benefit from the flat sub-network topology provided by weave, as it follows a more traditional path than docker’s linking approach. This will be the subject of my next article.

A more concrete example of a complete web application with CherryPy, PostgreSQL and haproxy

In the previous post, I described how to setup a docker image to host your CherryPy application. In this installment, I will present a complete – although simple – web application made of a database, two web application servers and a load-balancer.

Setup a database service

We are going to create a docker image to host our database instance, but because we are lazy and because it has been done already, we will be using an official image of PostgreSQL.

$ docker run --name webdb -e POSTGRES_PASSWORD=test -e POSTGRES_USER=test -d postgres

As you can see, we run the official, latest, PostgreSQL image. By setting the POSTGRES_USER and POSTGRES_PASSWORD variables, we make sure the container creates the corresponding account for us. We also set a name for this container; this will be useful when we link to it from another container, as we will see later on.

A word of warning, this image is not necessarily secure. I would advise you to consider this question prior to using it in production.

Now that the server is running, let’s create a database for our application. Run a new container which will execute the psql shell:

$ docker run -it --link webdb:postgres --rm postgres sh -c 'exec psql -h "$POSTGRES_PORT_5432_TCP_ADDR" -p "$POSTGRES_PORT_5432_TCP_PORT" -U test'
 Password for user test:
 psql (9.4.0)
 Type "help" for help.
 test=# CREATE DATABASE notes;
 CREATE DATABASE
 test=# \c notes \dt
 You are now connected to database "notes" as user "test".
 List of relations
 Schema | Name | Type | Owner
 --------+------+-------+-------
 public | note | table | test
 (1 row)
 notes=#

We connect to the server, create the “notes” database and then connect to it.

How did this work? Well, the magic happens through the --link webdb:postgres argument we provided to the run command. This tells docker we are linking to a container named webdb and that we create an alias named postgres for it inside the new container. That alias is used by docker to initialize a few environment variables such as:

POSTGRES_PORT_5432_TCP_ADDR: 
   the IP address of the linked container
POSTGRES_PORT_5432_TCP_PORT: 
   the exposed port 5432 (which is quite obviously the server's port)

Notice the POSTGRES_ prefix? This is exactly the alias we gave in the command’s argument. This is the mechanism by which you will link your containers so that they can talk to each other.

Note that there are alternatives, such as weave, that may be a little more complex but probably more powerful. Make sure to check them out at some point.

Setup our web application service

We are going to run a very basic web application. It will be a form to take notes. The application will display them and you will be able to delete each note. The notes are posted via javascript through a simple REST API. Nothing fancy. Here is a screenshot for you:

notes_screen

By the way, the application uses Yahoo’s Pure.css framework for a change from Bootstrap.

Simply clone the mercurial repository to fetch the code.

$ hg clone https://Lawouach@bitbucket.org/Lawouach/cherrypy-recipes
$ cd cherrypy-recipes/deployment/container/webapp_with_load_balancing/notesapp
$ ls
Dockerfile webapp

This will download the whole repository but fear not, it’s rather lightweight. You can review the Dockerfile which is rather similar to what was described in my previous post. Notice how we copy the webapp subdirectory onto the image.

We can now create our image from that directory:

$ docker build -t lawouach/webapp:latest .

As usual, change the tag to whatever suits you.

Let’s now run two containers from that image:

$ docker run --link webdb:postgres --name notes1 --rm -p 8080:8080 -i -t lawouach/webapp:latest
$ docker run --link webdb:postgres --name notes2 --rm -p 8081:8080 -i -t lawouach/webapp:latest

We link those two containers with the container running our database; we can therefore use that knowledge to connect to the database via SQLAlchemy. We also publish the application’s port to two distinct ports on the host. Finally, we name our containers so that we can reference them from the next container we will be creating.
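
Concretely, the application can assemble its database URL from the POSTGRES_* variables injected by the link. Here is a minimal sketch of that idea (the actual code lives in the webapp directory and may differ):

import os

from sqlalchemy import create_engine

# Variables injected by --link webdb:postgres; the credentials and the
# database name match what we created earlier.
url = "postgresql+psycopg2://test:test@%s:%s/notes" % (
    os.environ["POSTGRES_PORT_5432_TCP_ADDR"],
    os.environ["POSTGRES_PORT_5432_TCP_PORT"],
)
engine = create_engine(url)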

At this stage, you ought to see that your application is running by going either to http://localhost:8080/ or http://localhost:8081/.

Setup a load balancer service

Our last service – microservice should I say – is a simple load-balancer between our two web applications. To support this feature, we will be using haproxy, a well-known, reliable and lean component for such a task.

$ cd cherrypy-recipes/deployment/container/webapp_with_load_balancing/load_balancing
$ ls
Dockerfile haproxy.cfg

Take some time to review the Dockerfile. Notice how we copy the local haproxy.cfg file as the configuration for our load-balancer. Build your image like this:

$ docker build -t lawouach/haproxy:latest .

And now run it to start load balancing between your two web application containers:

$ docker run --link notes1:n1 --link notes2:n2 --name haproxy -p 8090:8090 -p 8091:8091 -d -t lawouach/haproxy:latest

In this case, we execute the container in the background because haproxy blocks and won’t log to the console anyway.

Notice how we link to both web application containers. We set short aliases out of pure laziness. We publish two ports to the host: port 8090 will be used to access the stats page of the haproxy server itself, and port 8091 will be used to access our application.

To understand how we reuse the aliases, please refer to the haproxy.cfg configuration, more precisely to those two lines:

server notes1 ${N1_PORT_8080_TCP_ADDR}:${N1_PORT_8080_TCP_PORT} check inter 4000
server notes2 ${N2_PORT_8080_TCP_ADDR}:${N2_PORT_8080_TCP_PORT} check inter 4000

We load-balance between our two backend servers and we do not have to know their address at the time when we build the image, but only when the container is started.

That’s about it really. At this stage, you ought to connect to http://localhost:8091/ to use your application. Each request will be sent to one of the two web application instances in turn. You may check the status of your load-balancing by connecting to http://localhost:8090/.

Obviously, this is just a basic example. For instance, you could extend it by setting up another service to manage your syslog and configure haproxy to send its logs to it.

Next time, we will be exploring the world of CoreOS and clustering before moving on to service and resource management via Kubernetes and Mesos.

Create a docker container for your CherryPy application

In the past year, process isolation through the use of containers has exploded and you can find containers for almost anything these days. So why not create a container to isolate your CherryPy application from the rest of the world?

I will not focus on the rights and wrongs of undertaking such a task. This is not the point of this article. Instead, this article will guide you through the steps to create a base container image that will support creating per-project images that can be run in containers.

We will be using docker for this since it’s the hottest container technology out there. It doesn’t mean it’s the best, just that it’s the most popular, which in turn means there is high demand for it. With that being said, once you have decided containers are a relevant feature for you, I encourage you to have a look at other technologies in the field to draw your own conclusions.

Docker uses various Linux kernel assets to isolate a process from the other running processes. In particular, it uses control groups to constrain the resources used by the process. Docker also makes the most of namespaces, which create an access layer to resources such as network, mounted devices, etc.

Basically, when you use docker, you run an instance of an image and we call this a container. An image is mostly a mille-feuille of read-only layers that are eventually unified into one. When an image is run as a container, an extra read-write layer is added by docker so that you can make changes at runtime from within your container. Those changes are lost every time you stop the running container unless you commit it into a new image.
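
To make this concrete, here is a quick sketch of that workflow (image and file names are just examples):

$ docker run -it --name scratch ubuntu bash       # run a container from a read-only image
root@0d1e2f3a4b5c:/# touch /tmp/hello             # this change lives in the read-write layer
root@0d1e2f3a4b5c:/# exit
$ docker commit scratch myname/ubuntu:scratch     # persist that layer as a new image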

So how to start up with docker?

Getting started

First of all, you must install docker. I will not spend much time here explaining how to go about it since the docker documentation does it very well already. However, I encourage you to:

  • install from the docker repository as it’s usually more up to date than the official distribution repositories
  • ensure you can run docker commands as a non-root user; this will make your daily usage of docker much easier

At the time of this writing, docker 1.4.1 is the latest version and this article was written using 1.3.3. Verify your version as follows:

$ docker version
Client version: 1.3.3
Client API version: 1.15
Go version (client): go1.3.3
Git commit (client): d344625
OS/Arch (client): linux/amd64
Server version: 1.3.3
Server API version: 1.15
Go version (server): go1.3.3
Git commit (server): d344625

Docker command interface

Docker is an application often executed as a daemon. To interact with it, you use the command line interface via the docker command. Simply run the following command to see the available sub-commands:

$ docker

Play a little with docker

Before we move on to creating our docker image for a CherryPy application, let’s play with docker a little.

The initial step is to pull an existing image. Indeed, you will likely not create your own OS image from scratch. Instead, you will use a public base image, available on the docker public registry. During the course of these articles, we will be using an Ubuntu base image. But everything would work the same with CentOS or something else.

$ docker pull ubuntu
Pulling repository ubuntu
8eaa4ff06b53: Download complete 
511136ea3c5a: Download complete 
3b363fd9d7da: Download complete 
607c5d1cca71: Download complete 
f62feddc05dc: Download complete 
Status: Downloaded newer image for ubuntu:latest

Easy right? The various downloads are those of the intermediary images that were generated by the Ubuntu image maintainers. Interestingly, this means you could start your image from any of those images.

Now that you have an image, you may wish to list all of them on your machine:

$ docker images
REPOSITORY TAG IMAGE ID CREATED VIRTUAL SIZE
ubuntu latest 8eaa4ff06b53 10 days ago 188.3 MB

Notice that the intermediate images are not listed here. To see them:

$ docker images -a

Note that, in the previous call, we didn’t specify any particular version for our docker image. You may wish to do so as follows:

$ docker pull ubuntu:14.10
Pulling repository ubuntu
bf49414948ac: Download complete 
511136ea3c5a: Download complete 
a7cca9443999: Download complete 
dbbd544a49e2: Download complete 
98b540cf0569: Download complete 
Status: Downloaded newer image for ubuntu:14.10

Let’s pull a CentOS image as well, just for fun:

$ docker pull centos:7
Pulling repository centos
8efe422e6104: Download complete 
511136ea3c5a: Download complete 
5b12ef8fd570: Download complete 
Status: Image is up to date for centos:7

Let’s now run a container and play around with it:

$ docker run --rm --name playground -i -t centos:7 bash

[root@7d5761d100e4 /]# ls
bin dev etc home lib lib64 lost+found media mnt opt proc root run sbin selinux srv sys tmp usr var

In the previous command, we start a bash command executed within a container using the Centos image tagged 7. We name the container to make it easy to reference it afterwards. This is not compulsory but is quite handy in certain situations. We also tell docker that it can dispose of that container when we exit it. Otherwise, the container will remain.

[root@7d5761d100e4 /]# uname -a
Linux 7d5761d100e4 3.13.0-43-generic #72-Ubuntu SMP Mon Dec 8 19:35:06 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

This is interesting because it shows that, indeed, the container is executed in the host kernel which, in this instance, is my Ubuntu operating system.

Finally, let’s look at the network configuration:

[root@7d5761d100e4 /]# ip addr show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN 
 link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
 inet 127.0.0.1/8 scope host lo
 valid_lft forever preferred_lft forever
 inet6 ::1/128 scope host 
 valid_lft forever preferred_lft forever
12: eth0: <BROADCAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
 link/ether 02:42:ac:11:00:03 brd ff:ff:ff:ff:ff:ff
 inet 172.17.0.3/16 scope global eth0
 valid_lft forever preferred_lft forever
 inet6 fe80::42:acff:fe11:3/64 scope link 
 valid_lft forever preferred_lft forever

Note that the eth0 interface is attached to the bridge the docker daemon created on the host. The docker security scheme means that, by default, nothing can reach that interface from the outside. However, the container may contact the outside world. Docker has extensive documentation regarding its networking architecture.

Note that you can see container statuses as follows:

$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
25454ad13219 centos:7 "bash" 4 minutes ago Up 4 minutes playground

Exit the container:

[root@7d5761d100e4 /]# exit

Run the command again:

$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES

As we can see, the container is indeed gone. Let’s now rewind a little and not tell docker to automatically remove the container when we exit it:

$ docker run --name playground -i -t centos:7 bash
[root@5960e4445743 /]# exit

Let’s see if the container is there:

$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES

Nope. So what’s different? Well, try again to start a container using that same name:

$ docker run --name playground -i -t centos:7 bash
2015/01/11 16:09:53 Error response from daemon: Conflict, The name playground is already assigned to 5960e4445743. You have to delete (or rename) that container to be able to assign playground to a container again.

Ooops. The container is actually still there:

$ docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
5960e4445743 centos:7 "bash" About a minute ago Exited (0) 57 seconds ago

There you go. By default, docker ps doesn’t show you containers in the exited state. You have to remove the container manually using its identifier:

$ docker rm 5960e4445743

I will not go further with docker usage as this is all you really need to get started.

A word about tags

Technically speaking, versions do not actually exist in docker images. They are in fact tags. A tag is a simple label for an image at a given point.

Images are identified with a hash value. As with IP addresses, you are not expected to recall the hash of the images you wish to use. Docker provides a mechanism to tag images, much like you would use domain names instead of IP addresses.

For instance, 14.10 above is actually a tag, not a version. Obviously, since tags are meant to be meaningful to human beings, it’s quite sensible for Linux distributions to be tagged following the version of the distributions.

You can easily create tags for any images as we will see later on.

Let’s talk about registries

Docker images are hosted and served by a registry. Often, as is the case in our previous examples, the registry used is the public docker registry available at: https://registry.hub.docker.com/

Whenever you pull an image, docker pulls from that public registry by default. However, you may query a different registry as follows:

$ docker pull hostname:port/path/to/image:tag

Basically, you provide the address of your registry and a path at which the image can be located. It has a form similar to a URI without the scheme.

Note that, as of docker 1.3.1, if the registry isn’t served over HTTPS, the docker client will refuse to download the image. If you need to pull anyway, you must add the following parameter to the docker daemon when it starts up.

 --insecure-registry hostname:port

Please refer to the official documentation to learn more about this.

A base Linux Python-ready container

Traditionally, deploying a CherryPy application has been done using a simple approach:

  • Package your application into an archive
  • Copy that archive onto a server
  • Configure a database server
  • Configure a reverse proxy such as nginx
  • Start the Python process(es) to serve your CherryPy application

That last operation is usually done by directly calling nohup python mymodule.py &. Alternatively, CherryPy comes with a handy script to run your application in a slightly more convenient fashion:

$ cherryd -d -c path/to/application/conf/server.conf -P path/to/application -i mymodule

This runs the Python module mymodule as a daemon using the given configuration file. If the -P flag isn’t provided, the module must be found in PYTHONPATH.
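
In other words, the following would be roughly equivalent to the call above, relying on PYTHONPATH instead of the -P flag:

$ export PYTHONPATH=path/to/application
$ cherryd -d -c path/to/application/conf/server.conf -i mymodule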

The idea is to create an image that will serve your application using cherryd. Let’s see how to setup an Ubuntu image to run your application.

$ docker run --name playground -i -t ubuntu:14.10 bash
root@d91ec7935e33:/#

First we create a user which will not have root permissions. This is good practice:

root@d91ec7935e33:/# useradd -m -d /home/web web
root@d91ec7935e33:/# mkdir /home/web/.venv

Next, we install a bunch of libraries that are required to deploy some common Python dependencies:

root@d91ec7935e33:/# apt-get update
root@d91ec7935e33:/# apt-get upgrade -y
root@d91ec7935e33:/# apt-get install -y libc6 libc6-dev libpython2.7-dev libpq-dev libexpat1-dev libffi-dev libssl-dev python2.7-dev python-pip
root@d91ec7935e33:/# apt-get autoclean -y
root@d91ec7935e33:/# apt-get autoremove -y

Then we create a virtual environment and install Python packages into it:

root@d91ec7935e33:/# pip install virtualenv
root@d91ec7935e33:/# virtualenv -p python2.7 /home/web/.venv/default
root@d91ec7935e33:/# source /home/web/.venv/default/bin/activate
root@d91ec7935e33:/# pip install cython
root@d91ec7935e33:/# pip install cherrypy==3.6.0 pyopenssl mako psycopg2 python-memcached sqlalchemy

These are common packages I use. Install whichever you require obviously.

As indicated by Tony in the comments, it is probably overkill to create a virtual environment in a container, since the whole point of a container is to isolate your process and its dependencies already. I’m so used to using virtual environments that I automatically created one. You may skip these steps.

Those operations were performed as the root user, so let’s make the web user the owner of those packages.

root@d91ec7935e33:/# chown -R web.web /home/web/.venv

Good. Let’s switch to that user now:

root@d91ec7935e33:/# sudo su web
web@d91ec7935e33:/# cd /home/web

At this stage, we have a container ready to support a CherryPy application. It might be interesting to commit that particular container as a new, tagged image so that we can reuse it in various contexts. Since the docker client isn’t available inside the container, run the following from another terminal on the host:

$ docker ps
CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS              PORTS                    NAMES
675ad8e8752d        ubuntu:14.10        "bash"              7 minutes ago       Up 7 minutes        0.0.0.0:9090->8080/tcp   playground
$ docker commit -m "Base Ubuntu with Python env" 675ad8e8752d
$ docker images
REPOSITORY          TAG                 IMAGE ID            CREATED             VIRTUAL SIZE
                            78bc8c5c2e3f        6 seconds ago       506 MB
$ docker tag 78bc8c5c2e3f lawouach/ubuntu:pythonbase
$ docker images
REPOSITORY          TAG                 IMAGE ID            CREATED             VIRTUAL SIZE
lawouach/ubuntu     pythonbase          78bc8c5c2e3f        32 seconds ago      506 MB

We take the docker container and commit it as a new image. We then tag the newly created image to make it easy to reuse later on.

Let’s see if it worked. Exit the container and start a new container from the new image.

$ docker run --name playground -i -t lawouach/ubuntu:pythonbase bash 
root@66c6b2a5bb08:/# sudo su web 
web@66c6b2a5bb08:/$ cd /home/web/ 
web@66c6b2a5bb08:~$ source .venv/default/bin/activate
(default)web@66c6b2a5bb08:~$

Well. We are ready to play now.

Run a CherryPy application in a docker container

For the purpose of this article, here is our simple application:

import cherrypy

class Root(object):
    @cherrypy.expose
    def index(self):
        return "Hello world"

cherrypy.config.update({'server.socket_host': '0.0.0.0'})
cherrypy.tree.mount(Root())

Two important points:

  • You must make sure CherryPy listens on the eth0 interface, so just make it listen on all the container interfaces. Otherwise, CherryPy will listen only on 127.0.0.1, which won’t be reachable from outside the container.
  • Do not start the CherryPy engine yourself; this is done by the cherryd command. You must simply ensure the application is mounted so that CherryPy can serve it.

Save this piece of code into your container under the module name: server.py. This could be any name, really. The module will be located in /home/web.

You can manually test the module:

$ docker run --name playground -p 9090:8080 -i -t lawouach/ubuntu:pythonbase bash
(default)web@66c6b2a5bb08:~$ cherryd -d -P /home/web -i server
(default)web@66c6b2a5bb08:~$ ip addr list eth0 | grep "inet "
inet 172.17.0.11/16 scope global eth0

The second line tells us the IPv4 address of this container. Next, point your browser to the following URL: http://localhost:9090/

“What is this magic?” I hear you say!

If you look at the command we use to start the container, we provide this bit: -p 9090:8080. This tells docker to map port 9090 on the host to port 8080 in the container, allowing your application to be reached from the outside.

And voilà!

Make the process a little more developer friendly

In the previous section, we saved the application’s code into the container itself. During development, this may not be practical. One approach is to use a volume to share a directory between your host (where you work) and the container.

$ docker run --name playground -p 9090:8080 -v `pwd`/webapp:/home/web/webapp -i -t lawouach/ubuntu:pythonbase bash

You can then work on your application and the container will see those changes immediately.

Automate things a bit

The previous steps have shown in details how to setup an image to run a CherryPy application. Docker provides a simple interface to automate the whole process: Dockerfile.

A Dockerfile is a simple text file containing all the steps to create an image and more. Let’s see it first hand:

FROM ubuntu:14.10

RUN useradd -m -d /home/web web && mkdir /home/web/.venv && \
apt-get update && apt-get upgrade -y && \
apt-get install -y libc6 libc6-dev libpython2.7-dev libpq-dev libexpat1-dev libffi-dev libssl-dev python2.7-dev python-pip && \
pip install virtualenv && \
virtualenv -p python2.7 /home/web/.venv/default && \
/home/web/.venv/default/bin/pip install cython && \
/home/web/.venv/default/bin/pip install cherrypy==3.6.0 pyopenssl mako psycopg2 python-memcached sqlalchemy && \
apt-get autoclean -y && \
apt-get autoremove -y && \
chown -R web.web /home/web/.venv

USER web
WORKDIR /home/web
ENV PYTHONPATH /home/web/webapp

COPY webapp /home/web/webapp

ENTRYPOINT ["/home/web/.venv/default/bin/cherryd", "-i", "server"]

Create a directory and save the content above into a file named Dockerfile. Create a subdirectory called webapp and store your server.py module into it.

Now, build the image as follow:

$ docker build -t lawouach/mywebapp:latest .

Use whatever tag suits you. Then, you can run a container like this:

$ docker run -p 9090:8080 -i -t lawouach/mywebapp:latest

That’s it! A docker container running your CherryPy application.

In the next articles, I will explore various options for using docker in a web application context. Follow-ups will also include an introduction to weave and CoreOS to clusterize your CherryPy application.

In the meantime, do enjoy.

“Robot Framework Test Automation” book review

From time to time PacktPub will request a book review of one of their Python-related titles. This time around it was regarding their “Robot Framework Test Automation” book they recently released. Since I’ve been using this awesome acceptance testing tool at work for more than two years, I was happy to comply.

In a nutshell, Robot Framework provides a great interface that acts as the middle-man between various stakeholders. Indeed, tests are written in plain text (though other formats are supported, I never use them) with a rather minimal set of rules, making it (almost) straightforward to read even by non-technical persons. The dirty technical details are hidden away, implemented in Python and executable on one of the various Python VMs (CPython, Jython and IronPython are supported out of the box).

Most of the time, the basics of the Robot Framework data model and workflow can be taught in a couple of hours. However, being efficient with it will take a little more time. Still, people don’t have to learn a complete programming language (Python), and that’s a relief, meaning they are happy to work with Robot Framework’s sometimes cumbersome syntax.

In spite of having rather extensive documentation available online, the project did lack a good, straight-to-the-point summary that takes you by the hand. Moreover, the project’s documentation style is fairly dry and Unix-like, making it tedious to browse sometimes. Still, the content is there and it rarely failed me. With that said, having a friendly book on the subject is a great thing. Kudos to PacktPub. Now about the book…

The good

The book provides an introduction to the tool, its most common usages and even tries to guide you getting more from it. It’s a short book, 83 pages, that will not bore you with complex details. In other words, it’s a good companion of the online documentation if you start with Robot Framework.

Sumit Bisht, the author, does a good job keeping a neutral point of view regarding how you should use Robot Framework. Indeed, depending on your software under test, you might want a more data-oriented approach (à la FitNesse), a behaviour-driven testing approach or even a more assert-oriented style. Not many tools can deal with all of them equally, and it also depends on how testing is perceived in your organisation. Robot Framework can cope with all of them.

The bad

Though I can understand it’s only an introduction, it feels like some concepts are not properly explored: the idea behind keywords, the internal data model, dynamic libraries, etc. In other words, you will not really understand the underlying blocks and axioms that are the pedestal of the whole tool; you’ll rather learn the basics of using it. In fact, the only section where the book goes into more technical detail (with a good example of using Sikuli) will probably confuse you, since it fails to properly introduce the principles behind it.

The ugly

There isn’t anything particularly bad with this book; again, it should be considered a friendly introduction. I do not agree with a few minor points Sumit makes, but they hardly matter and aren’t wrong anyway, just a matter of opinion. Note also that the book lacks examples in a couple of places where they would have mattered, but I don’t believe this makes the book any less useful.

The only thing that really annoys me is that PacktPub’s book layout still looks so unprofessional. They should really make an effort, as the code is, most of the time, too hard to read (actually, on this particular item, it wasn’t that bad).

Final note

I think this book is ideal if you are about to start with Robot Framework as it will speed up the basics. If you’re already used to the tool, I am not sure it will help very much.


ws4py – WebSocket client and server library for Python

Recently I released ws4py, a package that provides client and server WebSocket support for Python 2.6 and 2.7.

Let’s first have a quick overview of what ws4py offers for now:

  • Support for draft-10 of the WebSocket specification
  • A threaded client. This gives a simple client that doesn’t require an external dependency.
  • A Tornado client. This client is based on Tornado 2.0 which is quite a popular way of running asynchronous networking code these days. Tornado provides its own server implementation so I didn’t include mine in ws4py.
  • A CherryPy extension so that you can integrate WebSocket from within your CherryPy 3.2.1 server.
  • A gevent server based on the popular gevent library. This is courtesy of Jeff Lindsay.
  • Based on Jeff’s work, a pure WSGI middleware as well (available in the current master branch only until the next release).
  • ws4py runs on Android devices thanks to the SL4A package

Hopefully more clients and servers will be added along the way, as well as Python 3.x support. The former should be rather simple to add due to the way I designed ws4py.

The main idea is to make a distinction between the bytes provider and the bytes processing. The former essentially reads and writes bytes from the connected socket. The latter makes something out of the received bytes based on the WebSocket specification. In most implementations I have seen so far, both are rather heavily intertwined, making it difficult to use a different bytes provider.

ws4py tries a different path by relying on a great feature of Python: the possibility to send data back into a generator. For instance, the frame parser yields the quantity of bytes it needs each time, and the caller feeds those bytes back to the generator once they are received. In fact, the caller of a frame parser is a stream object which acts the same way. The caller of that stream object is in turn the bytes provider (a client or a server). The stream is in charge of aggregating frames into a WebSocket message. Thanks to that design, both the frame and stream objects are totally unaware of the bytes provider and can be easily adapted to various contexts (gevent, tornado, CherryPy, etc.).
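
To illustrate the principle, here is a much simplified sketch of that pattern (not ws4py’s actual parser): the generator announces how many bytes it needs and the bytes provider drives it with send().

def parser():
    # ask the caller for the two header bytes first
    header = yield 2
    length = header[1] & 0x7f          # ignore masking and extended lengths here
    # then ask for exactly the announced payload
    payload = yield length
    print("payload: %s" % payload)

p = parser()
needed = next(p)                         # the parser wants 2 bytes
needed = p.send(bytearray(b'\x81\x05'))  # feed the header; it now wants 5 bytes
try:
    p.send(bytearray(b'hello'))          # feed the payload; the parser completes
except StopIteration:
    pass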

On my TODO list for ws4py:

  • Upgrade to a more recent version of the specification
  • Python 3.x implementation
  • Better documentation (read: actually write documentation)
  • Better performances on very large WebSocket messages

Running CherryPy on Android with SL4A

CherryPy runs on Android thanks to the SL4A project. So if you feel like running Python and your own web server on your Android device, well, you can do just that. You’ve probably not heard something that awesome since the pizza delivery guy rang the door.

How to go about it? Well, that’s the surprise: CherryPy itself doesn’t need to be patched. Granted, I haven’t tried all the various tools provided by CherryPy, but the server and the dispatching work just fine.

First, you need to get the CherryPy source code, build it and copy the resulting cherrypy package into the SL4A scripts directory.

Once you’ve plugged your phone into your machine through USB, run the following commands:

$ svn co http://svn.cherrypy.org/trunk cp3-trunk
$ cd cp3-trunk
$ python setup.py build
$ cp -r build/lib.linux-i686-2.6/cherrypy/ /media/usb0/sl4a/scripts/

Just change the path to match your environment. That’s it.

Now you can copy your own script; let’s assume you use something like the one below:

# -*- coding: utf-8 -*-
import logging
# The multiprocessing package isn't
# part of the ASE installation so
# we must disable multiprocessing logging
logging.logMultiprocessing = 0
 
import android
import cherrypy
 
class Root(object):
    def __init__(self):
        self.droid = android.Android()
 
    @cherrypy.expose
    def index(self):
        self.droid.vibrate()
        return "Hello from my phone"
 
    @cherrypy.expose
    def location(self):
        location = self.droid.getLastKnownLocation().result
        location = location.get('network', location.get('gps'))
        return "LAT: %s, LON: %s" % (location['latitude'],
                                     location['longitude'])
 
def run():
    cherrypy.config.update({'server.socket_host': '0.0.0.0'})
    cherrypy.quickstart(Root(), '/')
 
if __name__ == '__main__':
    run()

As you can see we must disable the multiprocessing logging since the multiprocessing package isn’t included with SL4A.

Save that script on your computer as cpdroid.py for example. Copy that file into the scripts directory of SL4A.

$ cp cpdroid.py /media/usb0/sl4a/scripts/

Unplug your phone and go to the SL4A application. Click on the cpdroid.py script; it should start fine. Then, from your browser, go to http://phone_IP:8080/ and tada! You can also go to the /location path to get the geolocation of your phone.

Integrating SQLAlchemy into a CherryPy application

Quite often, people come to the CherryPy IRC channel asking about the way to use SQLAlchemy with CherryPy. There are a couple of good recipes on the tools wiki, but I find them a little complex to begin with. This is not the recipes’ fault: many people simply don’t know about CherryPy tools and plugins at that stage.

The following recipe tries to make the example complete whilst keeping it as simple as possible, to allow folks to get started with SQLAlchemy and CherryPy.

# -*- coding: utf-8 -*-
import os, os.path
 
import cherrypy
from cherrypy.process import wspbus, plugins
 
from sqlalchemy import create_engine
from sqlalchemy.orm import scoped_session, sessionmaker
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy import Column
from sqlalchemy.types import String, Integer
 
# Helper to map and register a Python class as a db table
Base = declarative_base()
 
class Message(Base):
    __tablename__ = 'message'
    id = Column(Integer, primary_key=True)
    value =  Column(String)
 
    def __init__(self, message):
        Base.__init__(self)
        self.value = message
 
    def __str__(self):
        return self.value.encode('utf-8')
 
    def __unicode__(self):
        return self.value
 
    @staticmethod
    def list(session):
        return session.query(Message).all()
 
 
class SAEnginePlugin(plugins.SimplePlugin):
    def __init__(self, bus):
        """
        The plugin is registered to the CherryPy engine and therefore
        is part of the bus (the engine *is* a bus) registry.
 
        We use this plugin to create the SA engine. At the same time,
        when the plugin starts we create the tables into the database
        using the mapped class of the global metadata.
 
        Finally we create a new 'bind' channel that the SA tool
        will use to map a session to the SA engine at request time.
        """
        plugins.SimplePlugin.__init__(self, bus)
        self.sa_engine = None
        self.bus.subscribe("bind", self.bind)
 
    def start(self):
        db_path = os.path.abspath(os.path.join(os.curdir, 'my.db'))
        self.sa_engine = create_engine('sqlite:///%s' % db_path, echo=True)
        Base.metadata.create_all(self.sa_engine)
 
    def stop(self):
        if self.sa_engine:
            self.sa_engine.dispose()
            self.sa_engine = None
 
    def bind(self, session):
        session.configure(bind=self.sa_engine)
 
class SATool(cherrypy.Tool):
    def __init__(self):
        """
        The SA tool is responsible for associating a SA session
        to the SA engine and attaching it to the current request.
        Since we are running in a multithreaded application,
        we use the scoped_session that will create a session
        on a per thread basis so that you don't worry about
        concurrency on the session object itself.
 
        This tools binds a session to the engine each time
        a requests starts and commits/rollbacks whenever
        the request terminates.
        """
        cherrypy.Tool.__init__(self, 'on_start_resource',
                               self.bind_session,
                               priority=20)
 
        self.session = scoped_session(sessionmaker(autoflush=True,
                                                  autocommit=False))
 
    def _setup(self):
        cherrypy.Tool._setup(self)
        cherrypy.request.hooks.attach('on_end_resource',
                                      self.commit_transaction,
                                      priority=80)
 
    def bind_session(self):
        cherrypy.engine.publish('bind', self.session)
        cherrypy.request.db = self.session
 
    def commit_transaction(self):
        cherrypy.request.db = None
        try:
            self.session.commit()
        except:
            self.session.rollback()  
            raise
        finally:
            self.session.remove()
 
 
 
 
class Root(object):
    @cherrypy.expose
    def index(self):
        # print all the recorded messages so far
        msgs = [str(msg) for msg in Message.list(cherrypy.request.db)]
        cherrypy.response.headers['content-type'] = 'text/plain'
        return "Here are your list of messages: %s" % '\n'.join(msgs)
 
    @cherrypy.expose
    def record(self, msg):
        # go to /record?msg=hello world to record a "hello world" message
        m = Message(msg)
        cherrypy.request.db.add(m)
        cherrypy.response.headers['content-type'] = 'text/plain'
        return "Recorded: %s" % m
 
if __name__ == '__main__':
    SAEnginePlugin(cherrypy.engine).subscribe()
    cherrypy.tools.db = SATool()
    cherrypy.tree.mount(Root(), '/', {'/': {'tools.db.on': True}})
    cherrypy.engine.start()
    cherrypy.engine.block()

The general idea is to use the plugin mechanism to register functions on an engine basis and to enable a tool that provides access to the SQLAlchemy session at request time.
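
To try the recipe, assuming you saved it as sa_recipe.py (any name will do), something along these lines should work:

$ python sa_recipe.py &
$ curl "http://127.0.0.1:8080/record?msg=hello"
Recorded: hello
$ curl "http://127.0.0.1:8080/"
Here are your list of messages: hello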

Using Jython as a CLI frontend to HBase

HBase, the well known non-relational distributed database, comes with a console program to perform various operations on an HBase cluster. I’ve personally found this tool to be a bit limited, and I’ve toyed with the idea of writing my own. Since HBase only comes with a Java driver for direct access, and the various RPC interfaces such as Thrift don’t offer the full set of functions over HBase, I decided to go for Jython and use the Java API directly. This article will show a mock-up of such a tool.

The idea is to provide a simple Python API over the HBase one and couple it with a Python interpreter. This means it offers the possibility to perform any Python (well, Jython) operations whilst operating on HBase itself with an easier API than the Java one.

Note also that the tool uses the WSPBus already described in an earlier article to control the process itself. You will therefore need CherryPy’s latest revision.

# -*- coding: utf-8 -*-
import sys
import os
import code
import readline
import rlcompleter
 
from org.apache.hadoop.hbase import HBaseConfiguration, \
     HTableDescriptor, HColumnDescriptor
from org.apache.hadoop.hbase.client import HBaseAdmin, \
     HTable, Put, Get, Scan
 
import logging
from logging import handlers
 
from cherrypy.process import wspbus
from cherrypy.process import plugins
 
class StaveBus(wspbus.Bus):
    def __init__(self):
        wspbus.Bus.__init__(self)
        self.open_logger()
        self.subscribe("log", self._log)
 
        sig = plugins.SignalHandler(self)
        if sys.platform[:4] == 'java':
            del sig.handlers['SIGUSR1']
            sig.handlers['SIGUSR2'] = self.graceful
            self.log("SIGUSR1 cannot be set on the JVM platform. Using SIGUSR2 instead.")
 
            # See http://bugs.jython.org/issue1313
            sig.handlers['SIGINT'] = self._jython_handle_SIGINT
        sig.subscribe()
 
    def exit(self):
        wspbus.Bus.exit(self)
        self.close_logger()
 
    def open_logger(self, name=""):
        logger = logging.getLogger(name)
        logger.setLevel(logging.INFO)
        h = logging.StreamHandler(sys.stdout)
        h.setLevel(logging.INFO)
        h.setFormatter(logging.Formatter("[%(asctime)s] %(name)s - %(levelname)s - %(message)s"))
        logger.addHandler(h)
 
        self.logger = logger
 
    def close_logger(self):
        for handler in self.logger.handlers:
            handler.flush()
            handler.close()
 
    def _log(self, msg="", level=logging.INFO):
        self.logger.log(level, msg)
 
    def _jython_handle_SIGINT(self, signum=None, frame=None):
        # See http://bugs.jython.org/issue1313
        self.log('Keyboard Interrupt: shutting down bus')
        self.exit()
 
class HbaseConsolePlugin(plugins.SimplePlugin):
    def __init__(self, bus):
        plugins.SimplePlugin.__init__(self, bus)
        self.console = HbaseConsole()
 
    def start(self):
        self.console.setup()
        self.console.run()
 
class HbaseConsole(object):
    def __init__(self):
        # we provide this instance to the underlying interpreter
        # as the interface to operate on HBase
        self.namespace = {'c': HbaseCommand()}
 
    def setup(self):
        readline.set_completer(rlcompleter.Completer(self.namespace).complete)
        readline.parse_and_bind("tab:complete")
        import user
 
    def run(self):
        code.interact(local=self.namespace)
 
class HbaseCommand(object):
    def __init__(self, conf=None, admin=None):
        self.conf = conf
        if not conf:
            self.conf = HBaseConfiguration()
        self.admin = admin
        if not admin:
            self.admin = HBaseAdmin(self.conf)
 
    def table(self, name):
        return HTableCommand(name, self.conf, self.admin)
 
    def list_tables(self):
        return self.admin.listTables().tolist()
 
class HTableCommand(object):
    def __init__(self, name, conf, admin):
        self.conf = conf
        self.admin = admin
        self.name = name
        self._table = None
 
    def row(self, name):
        if not self._table:
            self._table = HTable(self.conf, self.name)
        return HRowCommand(self._table, name)
 
    def create(self, families=None):
        desc = HTableDescriptor(self.name)
        if families:
            for family in families:
                desc.addFamily(HColumnDescriptor(family))
        self.admin.createTable(desc)
        self._table = HTable(self.conf, self.name)
        return self._table
 
    def scan(self, start_row=None, end_row=None, filter=None):
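        # Lazily open the table, build a Scan from whichever bounds/filter
        # were provided and yield each row returned by the scanner.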
        if not self._table:
            self._table = HTable(self.conf, self.name)
 
        sc = None
        if start_row and filter:
            sc = Scan(start_row, filter)
        elif start_row and end_row:
            sc = Scan(start_row, end_row)
        elif start_row:
            sc = Scan(start_row)
        else:
            sc = Scan()
        s = self._table.getScanner(sc)
        while True:
            r = s.next()
            if r is None:
                raise StopIteration()
 
            yield r
 
    def delete(self):
        self.disable()
        self.admin.deleteTable(self.name)
 
    def disable(self):
        self.admin.disableTable(self.name)
 
    def enable(self):
        self.admin.enableTable(self.name)
 
    def exists(self):
        return self.admin.tableExists(self.name)
 
    def list_families(self):
        desc = HTableDescriptor(self.name)
        return desc.getColumnFamilies()
 
class HRowCommand(object):
    def __init__(self, table, rowname):
        self.table = table
        self.rowname = rowname
 
    def put(self, family, column, value):
        p = Put(self.rowname)
        p.add(family, column, value)
        self.table.put(p)
 
    def get(self, family, column):
        r = self.table.get(Get(self.rowname))
        v = r.getValue(family, column)
        if v is not None:
            return v.tostring()
 
 
if __name__ == '__main__':
    bus = StaveBus()
    HbaseConsolePlugin(bus).subscribe()
    bus.start()
    bus.block()

To test the tool, you can simply grab the latest copy of HBase and run:

hbase-0.20.4$ ./bin/start-hbase.sh

Then you need to configure your classpath so that it includes all the HBase dependencies. To determine them:

$ ps auwx|grep java|grep org.apache.hadoop.hbase.master.HMaster|perl -pi -e "s/.*classpath //"

Copy the full list of jars and export CLASSPATH with it. (This is from the HBase wiki on Jython and HBase).

Next you have to add an extra jar to the classpath so that Jython supports readline:

$ export CLASSPATH=$CLASSPATH:$HOME/jython2.5.1/extlibs/libreadline-java-0.8.jar

Make sure libreadline-java is installed as well.

Now that your environment is set up, save the code above in a script named stave.py and run it as follows:

$ jython stave.py
Python 2.5.1 (Release_2_5_1:6813, Sep 26 2009, 13:47:54) 
[Java HotSpot(TM) Server VM (Sun Microsystems Inc.)] on java1.6.0_20
Type "help", "copyright", "credits" or "license" for more information.
(InteractiveConsole)
>>> c.table('myTable').create(families=['aFamily:'])
>>> c.table('myTable').list_families()
array(org.apache.hadoop.hbase.HColumnDescriptor)
>>> c.table('myTable').row('aRow').put('aFamily', 'aColumn', 'hello world!')
>>> c.table('myTable').row('aRow').get('aFamily', 'aColumn')
'hello world!'
>>> list(c.table('myTable').scan())
[keyvalues={aRow/aFamily:aColumn/1277645421824/Put/vlen=12}]

You can, of course, also import any Python module available to your Jython environment.
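For instance, here is a hypothetical session mixing the standard library with the console commands (it assumes the myTable table created above still exists; the timestamp column is just an example):

>>> import time
>>> t = c.table('myTable')
>>> t.row('aRow').put('aFamily', 'timestamp', str(time.time()))
>>> t.row('aRow').get('aFamily', 'timestamp')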

I will probably extend this tool over time but in the meantime I hope you’ll find it a useful canvas to operate HBase.

A quick chat WebSockets/AMQP client

In my previous article I described how to plug WebSockets into AMQP using Tornado and pika. As a follow-up, I’ll show you how this can be used to write the simplest chat client.

First, we create a Tornado web handler that returns a web page containing the JavaScript code which connects to and converses with our WebSockets endpoint through the WebSockets API.

class MainHandler(tornado.web.RequestHandler):
    def get(self):
        username = "User%d" % random.randint(0, 100)
        self.write("""<html>
        <head>
          <script type='application/javascript' src='/static/jquery-1.4.2.min.js'> </script>
          <script type='application/javascript'>
            $(document).ready(function() {
              var ws = new WebSocket('ws://localdomain.dom:8888/ws');
              ws.onmessage = function (evt) {
                 $('#chat').val($('#chat').val() + evt.data + '\\n');                  
              };
              $('#chatform').submit(function() {
                 ws.send('%(username)s: ' + $('#message').val());
                 $('#message').val("");
                 return false;
              });
            });
          </script>
        </head>
        <body>
        <form action='/ws' id='chatform' method='post'>
          <textarea id='chat' cols='35' rows='10'></textarea>
          <br />
          <label for='message'>%(username)s: </label><input type='text' id='message' />
          <input type='submit' value='Send' />
          </form>
        </body>
        </html>
        """ % {'username': username})

Every time the user enters a message and submits it, it is sent to our WebSockets endpoint which, in return, forwards any message back to the client, where it is appended to the textarea.

Internally, each client gets notified of any message through AMQP and the bus. Indeed, each WebSockets handler is subscribed to a bus channel that is published to every time the AMQP broker pushes data to the consumer. A side effect of this is that the JavaScript code above does not update the textarea when the user's message is sent, but when the server sends it back.

Let’s see how we had to change the Tornado application to support that handler as well as the serving of jQuery as a static resource (you need the jQuery toolkit in the same directory as the Python module).

 
if __name__ == '__main__':
    application = tornado.web.Application([
        (r"/", MainHandler),
        (r"/ws", WebSocket2AMQP),
        ], static_path=".", bus=bus)
 
    http_server = tornado.httpserver.HTTPServer(application)
    http_server.listen(8888)
 
    bus.subscribe("main", poll)
    WS2AMQPPlugin(bus).subscribe()
    bus.start()
    bus.block()

The code is here.

Once the server is running, open two browser windows and access http://localhost:8888/. You should be able to type messages in one and see them appear in both windows.

Note:

This has been tested against the latest Chrome release. You will need to either map “localdomain.dom” to your machine (for instance via an /etc/hosts entry) or use the IP address of your network interface in the JavaScript above, since Chrome doesn’t accept localhost or 127.0.0.1.

Plugging AMQP and WebSockets

In my last article, I discussed how the WSPBus can help you manage Python processes. This time, I'll show how the bus can also help plug heterogeneous frameworks together and manage them properly.

The following example plugs WebSockets and AMQP together in order to channel data flowing in and out of a WebSockets connection into AMQP exchanges and queues. For this, we'll be using the Tornado web framework to handle the WebSockets side and pika for the AMQP one.

pika uses the Python built-in asyncore module to perform its non-blocking socket operations, whilst Tornado comes with its own main loop on top of select or poll. Since Tornado doesn't offer a single function call to iterate its loop just once, we'll use its main loop directly to block the process and therefore won't be using the bus' own block method.

Let’s see what the bus looks like:

class MyBus(wspbus.Bus):
    def __init__(self, name=""):
        wspbus.Bus.__init__(self)
        self.open_logger(name)
        self.subscribe("log", self._log)
 
        self.ioloop = tornado.ioloop.IOLoop.instance()
        self.ioloop.add_callback(self.call_main)
 
    def call_main(self):
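        # Publish on the bus' "main" channel so that subscribers (e.g. the
        # asyncore poll callback) get a chance to run, then re-schedule this
        # callback on Tornado's IOLoop; the short sleep throttles the loop.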
        self.publish('main')
        time.sleep(0.1)
        self.ioloop.add_callback(self.call_main)
 
    def block(self):
        ioloop = tornado.ioloop.IOLoop.instance()
        try:
            ioloop.start()
        except KeyboardInterrupt:
            ioloop.stop()
            self.exit()
 
    def exit(self):
        wspbus.Bus.exit(self)
        self.close_logger()
 
    def open_logger(self, name=""):
        logger = logging.getLogger(name)
        logger.setLevel(logging.INFO)
        h = logging.StreamHandler(sys.stdout)
        h.setLevel(logging.INFO)
        h.setFormatter(logging.Formatter("[%(asctime)s] %(name)s - %(levelname)s - %(message)s"))
        logger.addHandler(h)
 
        self.logger = logger
 
    def close_logger(self):
        for handler in self.logger.handlers:
            handler.flush()
            handler.close()
 
    def _log(self, msg="", level=logging.INFO):
        self.logger.log(level, msg)

Next, we create a plugin that subscribes to the bus and is in charge of the AMQP communication.

class WS2AMQPPlugin(plugins.SimplePlugin):
    def __init__(self, bus):
        plugins.SimplePlugin.__init__(self, bus)
        self.conn = pika.AsyncoreConnection(pika.ConnectionParameters('localhost'))
        self.channel = self.conn.channel()
        self.channel.exchange_declare(exchange="X", type="direct", durable=False)
        self.channel.queue_declare(queue="Q", durable=False, exclusive=False)
        self.channel.queue_bind(queue="Q", exchange="X", routing_key="")
 
        self.channel.basic_consume(self.amqp2ws, queue="Q")
 
        self.bus.subscribe("ws2amqp", self.ws2amqp)
        self.bus.subscribe("stop", self.cleanup)
 
    def cleanup(self):
        self.bus.unsubscribe("ws2amqp", self.ws2amqp)
        self.bus.unsubscribe("stop", self.cleanup)
        self.channel.queue_delete(queue="Q")
        self.channel.exchange_delete(exchange="X")
        self.conn.close()
 
    def amqp2ws(self, ch, method, header, body):
        self.bus.publish("amqp2ws", body)
        ch.basic_ack(delivery_tag=method.delivery_tag)
 
    def ws2amqp(self, message):
        self.bus.log("Publishing to AMQP: %s" % message)
        self.channel.basic_publish(exchange="X", routing_key="", body=message)

The interesting bits are the amqp2ws and ws2amqp methods. The former is called whenever the AMQP broker pushes data to our consumer; we then use the bus to publish the message to any interested subscribers. The latter publishes messages coming from the WebSockets channel to AMQP.

Next let’s see the Tornado WebSockets handler.

class WebSocket2AMQP(websocket.WebSocketHandler):
    def __init__(self, *args, **kwargs):
        websocket.WebSocketHandler.__init__(self, *args, **kwargs)
        self.settings['bus'].subscribe("amqp2ws", self.push_message)
 
    def open(self):
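        # In this version of Tornado's WebSocket API, incoming messages must be
        # requested explicitly by registering a callback through receive_message.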
        self.receive_message(self.on_message)
 
    def on_message(self, message):
        self.settings['bus'].publish("ws2amqp", message)
        self.write_message(message)
        self.receive_message(self.on_message)
 
    def on_connection_close(self):
        self.settings['bus'].unsubscribe("amqp2ws", self.push_message)
 
    def push_message(self, message):
        self.write_message(message)

The on_message method is called whenever data is received from the client, while push_message is used to push data back to the client.

Finally, we plug everything together:

if __name__ == '__main__':
    application = tornado.web.Application([
        (r"/ws", WebSocket2AMQP),
        ], bus=bus)
 
    http_server = tornado.httpserver.HTTPServer(application)
    http_server.listen(8888)
 
    bus.subscribe("main", poll)
    WS2AMQPPlugin(bus).subscribe()
    bus.start()
    bus.block()

Notice that we subscribe the asyncore poll function to the bus’ main channel so that pika keeps working properly, as if we had called asyncore.loop().
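The poll callback itself is defined in the complete source; roughly, it only needs to run a single iteration of the asyncore loop every time the bus publishes on its main channel. A minimal sketch (the timeout value is just an assumption) could be:

import asyncore

def poll():
    # Service pika's asyncore-based socket once per bus iteration
    # instead of blocking in asyncore.loop().
    asyncore.poll(timeout=0.1)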

The code can be found here.