amplee documentation

Table of contents

Overview

Atom Publishing Protocol

The Atom Publishing Protocol (APP or AtomPub) is a community initiative to create an application protocol based on HTTP and the Atom 1.0 XML format. APP is meant to provide a standardized way to publish and edit resources on the web. APP has its own terminology:

APP defines two distinct member types:

Operations in APP are mostly performed against collections and members. A member is attached to a collection by POSTing it to the URI indicated by the collection. Each resource can then be retrieved, updated and deleted with the corresponding HTTP methods (GET, PUT and DELETE).
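The mapping between these operations and HTTP methods can be sketched with the standard library alone. This is only an illustration written for Python 3's urllib.request: the URIs are hypothetical, the entry is abbreviated, and the requests are built but never sent.

```python
from urllib.request import Request

# Hypothetical URIs, used only for illustration.
COLLECTION_URI = 'http://example.org/music'
MEMBER_URI = 'http://example.org/music/some-entry'

# Abbreviated Atom entry (a complete one also carries id, updated, author).
ENTRY = (b'<entry xmlns="http://www.w3.org/2005/Atom">'
         b'<title>Some entry</title></entry>')
ATOM_ENTRY_TYPE = 'application/atom+xml;type=entry'

# Create: POST the entry to the collection URI.
create = Request(COLLECTION_URI, data=ENTRY, method='POST',
                 headers={'Content-Type': ATOM_ENTRY_TYPE, 'Slug': 'Some entry'})
# Retrieve, update and delete: GET, PUT and DELETE on the member URI.
retrieve = Request(MEMBER_URI, method='GET')
update = Request(MEMBER_URI, data=ENTRY, method='PUT',
                 headers={'Content-Type': ATOM_ENTRY_TYPE})
delete = Request(MEMBER_URI, method='DELETE')

for request in (create, retrieve, update, delete):
    print(request.get_method(), request.full_url)
```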

amplee

amplee is a Python implementation of APP. The aim is to provide:

The idea of separating both elements is that one can easily implement a specific use case without too much complexity.

amplee API tour

The amplee API is not big but can be intimidating when you don't know the APP architecture. With a little practice, however, you will see that the API is not that difficult and that most steps are always the same.

amplee.atompub package

The amplee.atompub package contains a 1-to-1 implementation of the APP specification. This package allows you to manipulate APP entities from any Python application or from the command line, independently of the HTTP protocol itself. This can be handy as it means you can work on the APP store through code rather than through a network protocol. Let's see the modules defined by this package:

amplee.atompub.member package

As we have said in the APP overview, APP supports any media type as a resource with a specific treatment for Atom entry documents. amplee tries hard to reflect this plurality by providing a set of built-in members covering common use cases. Let's review those:

As amplee grows, more built-in members will be added to make it easy to handle a very large set of media types.

amplee.storage package

Since the idea of APP is to provide a protocol to publish and manipulate web resources, some kind of persistence is expected from an APP implementation. Since the beginning amplee has tried to provide a very simple storage interface that can easily be implemented on top of different persistence mechanisms. amplee therefore supports:

Note that a storage provides an API to distinguish between content and meta-data (which will normally be an Atom entry describing the media resource, if any). This distinction is quite loose, however, and has almost no impact on what is actually done internally.

amplee.handler package

Now that we have APP entities and a storage to persist resources, we need an HTTP handler that will allow us to set up the public interface to access the APP store.

amplee currently supports two main Python interfaces to work over HTTP:

Since CherryPy can both host any WSGI application and act as a WSGI application itself, providing both might seem redundant. However, many developers won't want to install CherryPy if they don't use it anywhere else. That's why amplee comes with a pure WSGI interface independent of any framework or library.

HOWTO and FAQ

In this section we will review common tasks you are likely to perform when using amplee. Hopefully these tips will help you get up to speed with its API.

Create a filesystem storage

from amplee.storage.dummyfs import DummyStorageFS
storage = DummyStorageFS('/absolute/path/to/repo/directory', enable_lock=False)  # or enable_lock=True

The enable_lock parameter tells amplee whether it should instantiate a thread lock for operations modifying the state of a resource within the storage. It defaults to False as there will be little reason to lock at this stage.
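What such a lock does can be sketched as follows. The LockableStorage class and its put method are hypothetical stand-ins, not amplee's implementation; the sketch only shows a lock guarding a state-modifying operation when enable_lock is True.

```python
import threading

class LockableStorage(object):
    """Hypothetical stand-in, not amplee's DummyStorageFS."""

    def __init__(self, enable_lock=False):
        self.resources = {}
        # Only create a lock when asked to, mirroring enable_lock.
        self.lock = threading.Lock() if enable_lock else None

    def put(self, name, content):
        # State-modifying operation: guard it when locking is enabled.
        if self.lock is None:
            self.resources[name] = content
            return
        with self.lock:
            self.resources[name] = content

storage = LockableStorage(enable_lock=True)
workers = [threading.Thread(target=storage.put, args=('track-%d' % i, b'data'))
           for i in range(4)]
for worker in workers:
    worker.start()
for worker in workers:
    worker.join()
print(sorted(storage.resources))
```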

Create a subversion storage

from amplee.storage.storesvn import SubversionStorage
# username and password are optional
storage = SubversionStorage(uri_to_repo, '/absolute/path/to/working/copy', username, password)

The subversion storage expects at least the first two parameters. The first one is a URI to the repository, for instance http://svn.myhost.com/trunk or file:///absolute/path/to/repo. The second parameter is the absolute path of the working copy on the local machine running the application. Note that both the repository and the working copy must exist prior to running that code. For example:

# create an empty repo
svnadmin create /absolute/path/to/repo
# create a working copy of it
svn co file:///absolute/path/to/repo /absolute/path/to/working/copy

Create a ZODB storage

from ZODB import FileStorage, DB
from amplee.storage.storezodb import ZODBStorage

db = DB(FileStorage.FileStorage('/absolute/path/to/db.fs'))
storage = ZODBStorage(db, 'some_text')

The ZODB storage (tested against ZODB 3.6) expects a DB instance. In our example we use a FileStorage but we could of course use ZEO instead, through a ClientStorage. The second parameter expects a Python string naming the top-level node within the ZODB instance we will be running. If this placeholder does not already exist it will be created.

Create a dejavu storage

from amplee.storage.storedejavu import DejavuStorage
conf = {'Connect': 'host=localhost dbname=amplee_test user=test password=test'}
storage = DejavuStorage('dejavu.storage.storepypgsql.StorageManagerPgSQL', conf)

The dejavu storage expects two parameters. The first one is a string indicating which underlying engine dejavu will be using. The second holds the settings to pass to that engine. Please review the dejavu documentation to use other databases.

Create an Amazon S3 storage

from amplee.storage.stores3 import S3Storage
storage = S3Storage(aws_access_key_id, aws_secret_access_key, unique_prefix)

This storage expects three parameters. The first one is your public key to access the Amazon service. The second is the private key used to authenticate you. The last one is a prefix that amplee will use transparently when creating new buckets on the servers. Make it as unique as possible: it is not meant to be human-readable anyway, and a unique prefix avoids conflicts with existing buckets belonging to someone else.
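One simple way to obtain such a prefix is to derive it from a random UUID. The make_unique_prefix helper below is hypothetical and not part of amplee. You would presumably generate the value once and keep it in your configuration, so that the same prefix is reused on every run.

```python
import uuid

def make_unique_prefix():
    # Hypothetical helper: 32 random lowercase hexadecimal characters.
    return uuid.uuid4().hex

unique_prefix = make_unique_prefix()
print(unique_prefix)
```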

Create a new storage type

It may happen that you require a specific storage type. This is fairly easy: make your class inherit from amplee.storage.Storage and implement its methods as needed. You can look at the built-in storages for examples.
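As a rough sketch of the pattern, here is an in-memory storage. The Storage class below is a local stand-in for amplee.storage.Storage so that the example runs on its own. Only create_container is a method name taken from this documentation; put_content and get_content are hypothetical names, and the real base class will expect more methods than shown here.

```python
class Storage(object):
    """Local stand-in for amplee.storage.Storage (assumption)."""

class InMemoryStorage(Storage):
    def __init__(self):
        # Maps a container name to a {resource_name: content} dictionary.
        self.containers = {}

    def create_container(self, name):
        # Return the existing container untouched if there is one.
        return self.containers.setdefault(name, {})

    def put_content(self, container_name, resource_name, content):
        # Hypothetical method name: store some content in a container.
        self.create_container(container_name)[resource_name] = content

    def get_content(self, container_name, resource_name):
        # Hypothetical method name: fetch content back, or None if missing.
        return self.containers.get(container_name, {}).get(resource_name)

storage = InMemoryStorage()
container = storage.create_container('music')
storage.put_content('music', 'song.ogg', b'some bytes')
print(storage.get_content('music', 'song.ogg'))
```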

Create a new container

amplee storages allow you to create new containers.

For example:

from amplee.storage.dummyfs import DummyStorageFS
storage = DummyStorageFS('/absolute/path/to/repo/directory')
container = storage.create_container('music')

Note that if the container already exists it is returned directly without being overwritten. Note also that each storage type performs its own specific action. For instance, a dejavu storage will create a new table in the database whose name matches the container name provided. A ZODB storage will create a new node in the tree, which will be an instance of the OOBTree class.

List all resources

If you work on an amplee.storage.Storage object:

storage.ls(container_name, ext='')

If you work on an amplee.atompub.store.AtomPubStore object:

store.list_members(container_name, ext='')

This will return a dictionary of the form:

{resource_name: {'path': full_resource_path}}

Note that full_resource_path depends on the nature of the storage. For a file system or subversion storage it will be the absolute path to the file, for an S3 storage it will be an instance of the Key class from the boto package, for a ZODB storage it will be a list, etc. In any case the returned value is suitable for the other storage methods.

List resources by their extension

If you work on an amplee.storage.Storage object:

storage.ls(container_name, ext='atom')

If you work on an amplee.atompub.store.AtomPubStore object:

store.list_members(container_name, ext='atom')

This will return a dictionary containing only the resources having the provided extension.

List all resources not having a given extension

If you work on an amplee.storage.Storage object:

storage.ls(container_name, ext='atom', distinct=True)

If you work on an amplee.atompub.store.AtomPubStore object:

store.list_members(container_name, ext='atom', distinct=True)

This will return a dictionary containing only the resources which don't have the provided extension.
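The three listing modes can be mimicked on a plain list of names. The list_resources function below is a hypothetical stand-in, not amplee code. It assumes that resource names carry their extension and that the path is a simple string, neither of which is guaranteed by a real storage.

```python
def list_resources(names, ext='', distinct=False, root='/repo/music'):
    """Mimic ls/list_members: all names, only `ext`, or everything but `ext`."""
    result = {}
    for name in names:
        # Without an extension filter every resource matches.
        matches = name.endswith('.' + ext) if ext else True
        # distinct=True inverts the extension filter.
        if ext and distinct:
            matches = not matches
        if matches:
            result[name] = {'path': root + '/' + name}
    return result

names = ['song.ogg', 'song.atom', 'cover.png']
print(sorted(list_resources(names)))
print(sorted(list_resources(names, ext='atom')))
print(sorted(list_resources(names, ext='atom', distinct=True)))
```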

Create a basic store

Use build_basic_store, a helper function that generates a store with one collection.

from amplee.handler import build_basic_store

collection, types = build_basic_store(member_storage, media_storage, workspace_title=u'Some cool workspace', 
          collection_title=u'Some even more cool application', 
          base_uri='http://localhost:8080', base_edit_uri='http://localhost:8080/manage', 
          accept_media_types=[u'entry', 'application/ogg'])

In this example we generate a store based on a member storage and a media storage. The returned values are the collection instance and a dictionary of the form:

{media-type: MemberType instance}

The amplee.handler.MemberType class wraps the parameters associated with a given media type, for instance which member class amplee should use when processing content tagged with that media type. It could look something like this:

from amplee.handler import MemberType
from amplee.atompub.member import atom, audio

types = {'application/atom+xml': MemberType('application/atom+xml', atom.AtomMember),
         'application/ogg': MemberType('application/ogg', audio.OGGMember)}

As an example you could then pass those values to a handler like this:

# see http://lukearno.com/projects/selector/
import selector
from wsgiref.simple_server import make_server
from amplee.handler.store.wsgi import Store

store = Store(collection, types)
s = selector.Selector()
s.add('[/]', POST=store.create_member, GET=store.get_collection, HEAD=store.head_collection)
s.add('/{rid:any}', GET=store.get_member, PUT=store.update_member, DELETE=store.delete_member, HEAD=store.head_member)

httpd = make_server('localhost', 8080, s)
httpd.serve_forever()

Provide your own IRI generator

When amplee generates an Atom entry as a member of the collection, it automatically sets the following link elements:

The second link is only added when dealing with media resources. The first link has an href defined as the IRI to the member resource. The second link has the href set to the media resource IRI.

This IRI is built from the base URI provided, to which is appended a string representing the resource in a meaningful way. For instance, say you have an OGG/Vorbis song called Moonlight under the sun (yeah poetry...). amplee will generate the following IRI for the edit-media link (if the base URI is http://myhost.net/music):

<atom:link rel="edit-media" href="http://myhost.net/music/Moonlight-under-the-sun.ogg" type="application/ogg" />

Note that the extension is automatically added but this can be avoided within the parameters of the audio.OGGMember class.

However, this might not suit your requirements and you may prefer to generate the last segment of the URI yourself. To do so, provide a Python callable to the build_basic_store function as follows:

build_basic_store(..., iri_name_creator=my_callable)

The callable signature is as follows:

def my_callable(slug, title=None):
    return unicode_object

The slug parameter will be the original value provided by the user-agent in the Slug HTTP header. The title parameter is the value that has been computed so far by amplee. This value depends on the member class. For instance, in the case of the OGGMember class it will be a value generated from the ID3 meta tags. In some cases this value will be None because amplee has no information to generate one.

In any case when this callable is provided, amplee will call it to generate the last segment of the member or media resource IRI.

Note that the callable can also be provided through the MemberType class parameters as follows:

params = {'name_creator': my_callable}
types = {'application/ogg': MemberType('application/ogg', audio.OGGMember, params=params)}
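A callable matching the signature above could look like the following sketch. The slug rules it applies (accent stripping, lowercasing, hyphens between words, an 'untitled' fallback) are choices made for this example, not amplee's behaviour.

```python
import re
import unicodedata

def my_callable(slug, title=None):
    # Prefer the client's Slug header, fall back on the title computed by amplee.
    text = slug or title or u'untitled'
    # Decompose accented characters, then drop anything that is not ASCII.
    text = unicodedata.normalize('NFKD', text)
    text = text.encode('ascii', 'ignore').decode('ascii')
    # Collapse every run of non-alphanumeric characters into a single hyphen.
    text = re.sub(r'[^A-Za-z0-9]+', '-', text).strip('-')
    return text.lower() or u'untitled'

print(my_callable(u'Moonlight under the sun'))
print(my_callable(None, title=u'Caf\xe9 del Mar'))
```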

Provide your own Atom id generator

By default amplee generates a value for the atom:id element of a member resource. The generated id is based on a UUID created with the uuid module (part of the standard library since Python 2.5). If you prefer to provide your own id generator you may do so by setting the entry_id_creator parameter of the member class.

For the atom.AtomMember class this would be:

def id_generator(base_uri, seed):
    # build a unicode object based on base_uri and seed
    return unicode_object

The first parameter is the base URI of the collection and is always provided. The seed parameter is provided by amplee and is a bridge.Element instance of the Atom entry document carried within the HTTP request. You could for example extract the title from the Atom entry, format it to make it valid within a URI and append it to the base_uri.

For the audio.OGGMember class it would look like:

def id_generator(base_uri, source, artist, album, tracknumber, title, date, genres):
    # build a unicode object from the base URI and the meta-data
    return unicode_object

Again this provides the base URI of the collection, the content from the request and meta-data extracted from the source itself.
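As an illustration, a generator for the audio.OGGMember signature might build the id from the artist, album and title, and fall back on a random UUID when that meta-data is missing. The naming scheme is an assumption made for this sketch. A real generator would also need to escape characters that are not valid in an IRI.

```python
import uuid

def id_generator(base_uri, source, artist, album, tracknumber, title, date, genres):
    parts = [artist, album, title]
    if not all(parts):
        # Not enough meta-data: fall back on a random UUID URN.
        return u'%s' % uuid.uuid4().urn
    # Lowercase each part and replace spaces so it fits in a path segment.
    segments = [part.strip().lower().replace(u' ', u'-') for part in parts]
    return base_uri.rstrip(u'/') + u'/' + u'/'.join(segments)

print(id_generator(u'http://myhost.net/music/', None, u'Some Artist', u'Some Album',
                   1, u'Moonlight under the sun', None, []))
```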

In both cases you would then do the following:

params = {'entry_id_creator': id_generator}
types = {'application/ogg': MemberType('application/ogg', audio.OGGMember, params=params)}

Common way to create a collection

The amplee.atompub.collection.AtomPubCollection class is the pivot of amplee as it is the main interface to manipulate member and media resources. Therefore this class takes quite a few parameters and can look intimidating.

Here is the most common usage of that class:

from amplee.atompub.collection import AtomPubCollection
from bridge import Element
from bridge.common import ATOM10_PREFIX, ATOM10_NS

categories = [Element(u'category', attributes={u'term': u'bloggyblogga'},
                      prefix=ATOM10_PREFIX, namespace=ATOM10_NS)]
# workspace: a previously created workspace instance
collection = AtomPubCollection(workspace, name_or_id="some_unique_and_private_value",
                               title=u"Yeah a cool list of things",
                               xml_attrs={'base': u"http://somehost.net", 'lang': u'en'},
                               base_uri=u"", base_edit_uri=u"edit",
                               accept_media_types=[u'entry'], categories=categories)

In this example we:

Make a collection the preferred one

APP allows you to mark one collection as the favorite one of the workspace it belongs to. To do so, the collection element must be the first child of the workspace element in the XML representation of the workspace. amplee lets you define which collection is the favorite by setting its favorite attribute to True. You can also pass it to the AtomPubCollection constructor.
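The ordering rule can be sketched with the standard library. This is not how amplee serializes a workspace. The build_workspace helper and its tuple format are hypothetical, the workspace is abbreviated, and the namespace used is the one from RFC 5023, which may differ from the draft amplee was written against.

```python
import xml.etree.ElementTree as ET

APP_NS = 'http://www.w3.org/2007/app'
ATOM_NS = 'http://www.w3.org/2005/Atom'

def build_workspace(collections):
    """collections: list of (title, href, favorite) tuples (hypothetical shape)."""
    workspace = ET.Element('{%s}workspace' % APP_NS)
    # sorted() is stable: the favorite moves first, the others keep their order.
    ordered = sorted(collections, key=lambda item: not item[2])
    for title, href, favorite in ordered:
        collection = ET.SubElement(workspace, '{%s}collection' % APP_NS, href=href)
        ET.SubElement(collection, '{%s}title' % ATOM_NS).text = title
    return workspace

workspace = build_workspace([
    ('Pictures', 'http://example.org/pictures', False),
    ('Music', 'http://example.org/music', True),
])
print(ET.tostring(workspace).decode('utf-8'))
```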

Set the `fixed` attribute of the categories element

APP gives the server a means to refuse incoming Atom entries based on their category elements. To do this, set the fixed_categories attribute of the collection instance to True. amplee will then reject any POSTed entry that does not have at least one category element matching those provided to the collection. The comparison is currently made on the term attribute only.
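The check described above amounts to comparing term attributes. The is_acceptable function below is a hypothetical stand-alone sketch using the standard library, not amplee's own code.

```python
import xml.etree.ElementTree as ET

ATOM_NS = 'http://www.w3.org/2005/Atom'

def is_acceptable(entry_xml, allowed_terms):
    """True if at least one atom:category term of the entry is allowed."""
    entry = ET.fromstring(entry_xml)
    terms = [category.get('term')
             for category in entry.findall('{%s}category' % ATOM_NS)]
    return any(term in allowed_terms for term in terms)

allowed = set(['bloggyblogga'])
accepted = ('<entry xmlns="http://www.w3.org/2005/Atom">'
            '<category term="bloggyblogga"/></entry>')
rejected = ('<entry xmlns="http://www.w3.org/2005/Atom">'
            '<category term="other"/></entry>')
print(is_acceptable(accepted, allowed))
print(is_acceptable(rejected, allowed))
```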

Feed of a collection

You may want to provide the Atom feed representation of your collection. You can do it like this:

collection.feed.xml()

The feed attribute of the collection instance is a bridge.Element instance.

Reload all members from the storage

amplee keeps a reference to all the members of a collection in memory in order to speed up some of its lookups. When you stop a process running amplee you lose this cache. It is often useful to be able to refill it when needed:

collection.reload_members()

Add a member to a collection

Once you have a collection you will want to add new members or update existing ones. Do it as follows:

collection.attach(member, member_content)

This will add a member and its content. If you have a media resource:

collection.attach(member, member_content, media_content)

The attach method accepts more options but they should rarely be needed.

To make the modification effective in the storage you must then call:

collection.store.commit()

Delete a member from a collection

You can remove a member from the collection as follows:

collection.prune(member_id, media_id)
collection.store.commit()

Retrieve media resource id from its member id (and vice versa)

Sometimes you have the member id and wish to know the media id. You can do so as follows:

member_id, media_id = collection.convert_id(member_id=some_value)

The opposite is also doable:

member_id, media_id = collection.convert_id(media_id=some_value)