Wednesday, 6 February 2008

Python eggs - a Simple Introduction

Python eggs used to be the wave of the future. But for Zope and Plone developers this has evolved into a true tsunami. They are everywhere now.

Yet there is a lot of confusion about what they are and how to use them.

To understand them, you need to understand Python's way of organizing code files.


Module



The basic unit of code reusability in Python: a block of code imported by some other code. It is most often written in Python and contained in a single .py file. A file that is meant to be run directly rather than imported is usually called a script.


hello.py


Let us say that this hello.py contains a function:


def helloworld():
    print('Hello World')


Then it is possible to import that function like this:


from hello import helloworld
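To try this without touching your own files, here is a minimal sketch that writes a throwaway hello.py into a temporary directory and imports from it. The temporary directory and the sys.path handling are assumptions made only so the example runs as one script, and the function returns the greeting instead of printing it so the result can be checked.

```python
import os
import sys
import tempfile

# Write a throwaway hello.py into a temporary directory (assumed location).
workdir = tempfile.mkdtemp()
with open(os.path.join(workdir, "hello.py"), "w") as f:
    f.write("def helloworld():\n    return 'Hello World'\n")

# Make the directory importable and forget any cached module named 'hello'.
sys.path.insert(0, workdir)
sys.modules.pop("hello", None)

from hello import helloworld

greeting = helloworld()
print(greeting)
```

Normally hello.py would simply sit next to the code that imports it, or somewhere else on sys.path.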


Package



A module that contains other modules; typically contained in a directory in the filesystem and distinguished from other directories by the presence of a file __init__.py.

A step up from a module is a package, which is a directory with an __init__.py file in it.


hello/
    __init__.py


You can then put the helloworld function into the __init__.py file, and import it like you did before:


from hello import helloworld


You could also keep it in the hello.py file from before.


hello/
    __init__.py
    hello.py


But then you must import it like this:


from hello.hello import helloworld


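You can avoid the longer path by importing the function into the package namespace. You do this in the __init__.py file, using an explicit relative import (available from Python 2.5; on Python 2.4 write from hello import helloworld here instead):


from .hello import helloworld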


Then you will once more be able to write:


from hello import helloworld


This ensures that you can reorganize your code and still retain backwards compatibility.
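The re-export trick can be checked with a small sketch. It builds the hello package in a temporary directory (an assumption made only so the example is self-contained) and writes the re-export in __init__.py as an explicit relative import, the spelling that Python 2.5 and later accept. The function returns the greeting instead of printing it so the result can be checked.

```python
import os
import sys
import tempfile

workdir = tempfile.mkdtemp()
package_dir = os.path.join(workdir, "hello")
os.mkdir(package_dir)

# hello/hello.py holds the actual function.
with open(os.path.join(package_dir, "hello.py"), "w") as f:
    f.write("def helloworld():\n    return 'Hello World'\n")

# hello/__init__.py re-exports it into the package namespace.
with open(os.path.join(package_dir, "__init__.py"), "w") as f:
    f.write("from .hello import helloworld\n")

sys.path.insert(0, workdir)
# Forget any cached 'hello' modules so the demo is repeatable.
for name in [n for n in sys.modules if n == "hello" or n.startswith("hello.")]:
    del sys.modules[name]

from hello import helloworld as short_path
from hello.hello import helloworld as long_path

# Both import paths lead to the very same function object.
same_function = short_path is long_path
print(same_function)
```

Callers that use the short path keep working even if the function later moves to a different file inside the package, as long as __init__.py keeps re-exporting it.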

You can have packages inside packages. A Python library is just a module, or a structure of modules.

A structure of modules is called a package.

Distutils



So far it has all been about writing and organizing Python code. But the next step is distribution of said code. The first step in this direction is distutils.

Distutils was written to have a single unified way to install Python modules and packages. Basically you just cd to the directory of the module and write:


python setup.py install


Then the module will automagically install itself into the Python installation it was invoked with.
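If you have several Pythons on your machine, it helps to know which one you are running and where it keeps installed packages. A small sketch using the standard sysconfig module (Python 2.7 and later), assuming a pure-Python package and the default install scheme:

```python
import sys
import sysconfig

# The interpreter running this script.
interpreter = sys.executable

# Directory where pure-Python packages are installed for this interpreter.
install_dir = sysconfig.get_path("purelib")

print(interpreter)
print(install_dir)
```

Run it with the same python you intend to use for setup.py install, and the second line should show where the module is expected to end up.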


Distutils defines a directory/file structure outside your module that has nothing to do with the module per se, but is used to distribute the module.

If you want to make a distribution of the hello module you must put it inside a directory that also contains a setup.py file.


somedir/
    setup.py
    hello/
        __init__.py
        hello.py


The setup.py could contain this code, which runs the setup function:


from distutils.core import setup

setup(name='hello',
      version='1.0',
      packages=['hello',],
      )


You then run the code like this:

python setup.py sdist

And it will create a new directory structure like this:


somedir/
    setup.py
    hello/
        __init__.py
        hello.py
    dist/
        hello-1.0.tar.gz


The hello-1.0.tar.gz then contains the package distribution. It has this structure when unpacked:


hello-1.0/
    PKG-INFO
    setup.py
    hello/
        __init__.py
        hello.py


The hello package is inside it. It is just a copy of your own package with no changes.

setup.py is there too. It is also just a copy of the one you wrote to create the distribution. The clever thing about distutils is that it uses the same script to create the distribution as it uses to install the package.

PKG-INFO is a new file that contains some metadata for the package. The metadata can be set in setup.py.
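PKG-INFO is a plain text file of "Key: value" headers, much like the headers of an e-mail, so the standard email module can read it. The sketch below parses a sample whose contents are assumed for illustration, not copied from a real build; a generated file will contain more fields.

```python
from email import message_from_string

# Illustrative PKG-INFO contents (assumed, not copied from a real build).
pkg_info_text = """\
Metadata-Version: 1.0
Name: hello
Version: 1.0
"""

# Each header becomes a key that can be looked up by name.
metadata = message_from_string(pkg_info_text)
name = metadata["Name"]
version = metadata["Version"]
print(name, version)
```

The Name and Version values correspond to the name and version arguments passed to setup().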

Setuptools



Setuptools is built on top of distutils. It makes it possible to publish packages on PyPI, or somewhere else. It uses eggs for distribution.

Eggs



An egg is created very much like a distutils package. You just have to change a line in your setup.py:


from setuptools import setup # this is new

setup(name='hello',
      version='1.0',
      packages=['hello',],
      )


Then you call it with:


python setup.py bdist_egg


And you get a new file in your dist directory:


dist/
    hello-1.0-py2.4.egg


This is the egg that you can put on your website, or even better, publish to PyPI. Once you have an account on PyPI, you can upload your eggs from the command line like this:


python setup.py bdist_egg upload


Easy Install



When you have uploaded your egg, anyone can use it by installing it with easy_install:


easy_install hello


easy_install will then find the egg on PyPI, download it, compile it if necessary, and add it to your sys.path so that Python can find it.
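The sys.path part works because an .egg file is a zip archive, and Python can import straight from zip archives that are on sys.path. Here is a minimal sketch of that mechanism. The archive is hand-made, contains only the package (a real egg also carries an EGG-INFO directory with metadata), and lives in a temporary directory; all of that is assumed for illustration.

```python
import os
import sys
import tempfile
import zipfile

workdir = tempfile.mkdtemp()
# Hypothetical file name, following the naming pattern of a built egg.
egg_path = os.path.join(workdir, "hello-1.0-py2.4.egg")

# Hand-made zip archive holding only the package source.
with zipfile.ZipFile(egg_path, "w") as egg:
    egg.writestr("hello/__init__.py",
                 "def helloworld():\n    return 'Hello World'\n")

# Putting the archive itself on sys.path makes its contents importable.
sys.path.insert(0, egg_path)
# Forget any cached 'hello' modules so the demo is repeatable.
for name in [n for n in sys.modules if n == "hello" or n.startswith("hello.")]:
    del sys.modules[name]

from hello import helloworld

greeting = helloworld()
print(greeting)
```

easy_install does the sys.path bookkeeping for you, so you never have to edit sys.path by hand like this.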

Buildout



Buildout is a configuration-based system for making complicated but repeatable setups for large systems.

Phew. That sounds complicated. Well, buildout can be. But what is interesting from an egg point of view is that you configure which eggs are to be installed in your system.

Inside your buildout.cfg you can have a line like:


eggs =
    hello


Then buildout will automatically download and install the hello package in your system.
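To see how such a multi-line option is read, here is a sketch that parses a small buildout.cfg with the standard configparser module. Buildout has its own configuration parser, so configparser is only a stand-in here, and the part name app and the overall file layout are assumptions made for illustration.

```python
import configparser

# Hypothetical buildout.cfg; the part name 'app' is made up for illustration.
buildout_cfg = """\
[buildout]
parts = app

[app]
recipe = zc.recipe.egg
eggs =
    hello
"""

config = configparser.ConfigParser()
config.read_string(buildout_cfg)

# A multi-line option value holds one egg name per indented line.
eggs = config["app"]["eggs"].split()
print(eggs)
```

Adding another egg is then just a matter of adding another indented line under eggs =.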

Buildouts can themselves be distributed as eggs, and you can extend a buildout to add new packages. This is how you can install a Plone buildout and then add your own packages to it, basically creating your own custom Plone distribution.

Resources




Python modules and packages

http://www.python.org/doc/2.4.4/tut/node8.html

Distutils

http://docs.python.org/lib/module-distutils.html

Setuptools

http://peak.telecommunity.com/DevCenter/setuptools

Eggs

http://peak.telecommunity.com/DevCenter/PythonEggs

Buildout

http://pypi.python.org/pypi/zc.buildout