.. currentmodule:: playdoh

.. _playdoh:

##################
Playdoh User Guide
##################

Playdoh is a pure Python library for distributing 
computations across the free computing units (CPUs and GPUs) available in a small network 
of multicore computers. Playdoh 
supports independent (embarassingly) parallel problems as well as loosely coupled 
tasks such as global optimizations, Monte Carlo simulations and numerical integration of 
partial differential equations. It is designed to be lightweight and easy to use and 
should be of interest to scientists wanting to turn their lab computers into a small 
cluster at no cost.

This user guide is an introduction to Playdoh. It shows how to distribute
independent parallel tasks, how to distribute optimizations, and how to write loosely coupled parallel
tasks.

.. seealso:: Reference documentation :ref:`reference`.

===========
Quick Start
===========

Installation
------------

First of all, you should have `Python 2.6 <http://www.python.org/>`__ installed on 
your machine with `Numpy 1.3 <http://numpy.scipy.org/>`__ at least
(Playdoh is currently not available for Python 3). 
Then, go to the `download page`_ and download the archive or the executable if you're on Windows. 
Finally, install the package with the Windows executable or with the following command::

	python setup.py install
	
.. _download page: http://code.google.com/p/playdoh/downloads/list

You can also run in a console::

	easy_install playdoh

The installation script requires the Python package `setuptools <http://pypi.python.org/pypi/setuptools>`__
so that the command-line tool included in Playdoh can be automatically installed. The `setuptools` package
should be automatically installed when you install Playdoh.
The installation script automatically installs the following tools: ``playdoh`` and ``playdoh_gui``,
which are a command-line tool and a GUI, respectively. 

All scripts using Playdoh should start by importing the Playdoh package as follows::

	from playdoh import *


.. _guide_independent:

Independent parallel problems
-----------------------------

Playdoh offers a parallel and distributed implementation of the :func:`map` function
to quickly evaluate a single Python function against several sets of parameters,
across several CPUs and machines.

The following example shows how to distribute the function ``y=x**2`` across two CPUs::

	from playdoh import *
	
	# The function to parallelize
	def fun(x):
	    return x**2
	
	# This line is required on Windows, any call to a Playdoh function
	# must be done after this line on this OS.
	# See http://docs.python.org/library/multiprocessing.html#windows
	if __name__ == '__main__':
	    # Execute ``fun(1)`` and ``fun(2)`` in parallel on two CPUs on this machine
	    # and return the result.
	    print map(fun, [1,2], cpu=2)

.. seealso:: Reference for :func:`map`, examples for :ref:`examples_independent`.


.. _guide_optimization:

Optimization
------------

Playdoh offers two functions :func:`minimize` and :func:`maximize` to quickly optimize
a Python objective function (fitness) in parallel across several CPUs and several
machines. Three optimization algorithms are available: the particle swarm optimization :class:`PSO`,
the covariance matrix adaptation evolution strategy :class:`CMAES`,
and an island genetic algorithm :class:`GA`


The following example shows how to maximize a Gaussian function in one dimension
across 2 CPUs::

	from playdoh import *
	import numpy
	
	# The fitness function to maximize
	def fun(x):
	    return numpy.exp(-x**2)
	
	if __name__ == '__main__':
	    # Maximize the fitness function in parallel
	    results = maximize(fun,
	                       popsize = 10000, # size of the population
	                       maxiter = 10, # maximum number of iterations
	                       cpu = 2, # number of CPUs to use on the local machine
	                       x_initrange = [-10.,10.]) # initial interval for the ``x`` parameter
	    
	    # Display the final results in a table
	    print_table(results)
    
.. seealso:: Reference for :func:`maximize`, :func:`minimize`, examples for :ref:`examples_optimization`.


.. _guide_loosely:

Loosely coupled parallel problems
---------------------------------

Some computational tasks cannot be distributed using the independent
parallel interface of Playdoh but require some communication between subtasks
and the introduction of synchronisation
points. Playdoh offers a simple programming
interface to do this. 

The following example shows an implementation of a numerical solver of a 
partial differential equation (the heat equation) in parallel across several CPUs.
There are two steps. 

* First, the task itself must be written: it is a 
  Python class which actually performs the computation. Every computing
  unit (node) stores and executes its own instance. 
  Communication between nodes happens through tubes, which are one-way named FIFO
  queues between two nodes. The source puts any Python object in the tube with
  a ``push``, and the target gets objects in the tube
  with a (blocking) ``pop``. This allows a simple implementation
  of synchronisation barriers.

* Then, the task launcher 
  executes on the client and launches the task on the CPUs on the local
  machine or on several machines across the network.
  It is done by calling the Playdoh function
  :func:`start_task`. Here, we launch the task on two CPUs on the local machine.
  The :func:`start_task` function triggers the instantiation of the 
  class on every node, the call to the ``initialize`` method with
  the arguments given in the ``args`` keyword argument,
  and finally the call to the ``start`` method.

The full script::

	from playdoh import *
	from numpy import *
	from pylab import * 
	
	# Any task class must derive from the ParallelTask
	class HeatSolver(ParallelTask):
	    def initialize(self, X, dx, dt, iterations):
	        # X is a matrix with the function values and the boundary values
	        # X must contain the borders of the neighbors ("overlapping Xs")
	        self.X = X 
	        self.n = X.shape[0]
	        self.dx = dx
	        self.dt = dt
	        self.iterations = iterations
	        self.iteration = 0
	
	    def send_boundaries(self):
	        # Send boundaries of the grid to the neighbors
	        if 'left' in self.tubes_out:
	            self.push('left', self.X[:,1])
	        if 'right' in self.tubes_out:
	            self.push('right', self.X[:,-2])
	    
	    def recv_boundaries(self):
	        # Receive boundaries of the grid from the neighbors
	        if 'right' in self.tubes_in:
	            self.X[:,0] = self.pop('right')
	        if 'left' in self.tubes_in:
	            self.X[:,-1] = self.pop('left')
	    
	    def update_matrix(self):
	        # Implement the numerical scheme for the PDE
	        Xleft, Xright = self.X[1:-1,:-2], self.X[1:-1,2:]
	        Xtop, Xbottom = self.X[:-2,1:-1], self.X[2:,1:-1]
	        self.X[1:-1,1:-1] += self.dt*(Xleft+Xright+Xtop+Xbottom-4*self.X[1:-1,1:-1])/self.dx**2
	
	    def start(self):
	        # Run the numerical integration of the PDE
	        for self.iteration in xrange(self.iterations):
	            self.send_boundaries()
	            self.recv_boundaries()
	            self.update_matrix()
	    
	    def get_result(self):
	        # Return the result
	        return self.X[1:-1,1:-1]
	
	def heat2d(n, iterations, nodes):
	    # ``split`` is the grid size on each node, without the boundaries
	    split = [(n-2)*1.0/nodes for _ in xrange(nodes)]
	    split = array(split, dtype=int)
	    split[-1] = n-2-sum(split[:-1])
	    
	    dx=2./n
	    dt = dx**2*.2
	    
	    # y is a Dirac function at t=0
	    y = zeros((n,n))
	    y[n/2,n/2] = 1./dx**2
	    
	    # Split y horizontally
	    split_y = []
	    j = 0
	    for i in xrange(nodes):
	        size = split[i]
	        split_y.append(y[:,j:j+size+2])
	        j += size
	    
	    # Define a double linear topology 
	    topology = []
	    for i in xrange(nodes-1):
	        topology.append(('right', i, i+1))
	        topology.append(('left', i+1, i))
	    
	    # Start the task
	    task = start_task(HeatSolver, # name of the task class
	                      cpu = nodes, # use ``nodes`` CPUs on the local machine
	                      topology = topology,
	                      args=(split_y, dx, dt, iterations)) # arguments of the ``initialize`` method
	                                              
	    # Retrieve the result, as a list with one element returned by ``HeatSolver.get_result`` per node
	    result = task.get_result()
	    result = hstack(result)
	    
	    return result
	
	if __name__ == '__main__':
	    result = heat2d(50, 100, 2)
	    hot()
	    imshow(result)
	    show()
	    
.. seealso:: Reference for :func:`start_task`, :class:`ParallelTask`, examples for :ref:`examples_loosely`.


.. _guide_machines:

=======================
Using several computers
=======================

.. _launching:

Launching the Playdoh server
----------------------------

Any computer within your local Ethernet network can be used to run computations
with Playdoh. First, Python and Playdoh must be installed. Then, the Playdoh
server must run so that computations can be submitted to it. Finally,
when you launch a task, you can specify the special keyword ``machines``
which is a list containing the IP addresses of the machines to use.

To launch the Playdoh server, you have two options.

Using Python
~~~~~~~~~~~~

Use the :func:`open_server` function to start the Playdoh server::

	# Open the server on the default port, using 4 CPUs and 1 GPU
	open_server(maxcpu=4, maxgpu=1)

You can close a server remotely using the :func:`close_servers` function::
    
    # Close the bobs-machine.university.com server
    close_servers(['bobs-machine.university.com'])


Using the command-line tool
~~~~~~~~~~~~~~~~~~~~~~~~~~~

You can use the ``playdoh`` command-line tool::

	# Open the server on the default port, using all CPUs and GPUs available
	playdoh open
	
	# Open the server with 2 CPUs and 1 GPU
	playdoh open 2 CPUs 1 GPU

You can also close servers and allocate resources using this script.
    
.. seealso:: Reference :ref:`ref_commandline`.


.. _sharing:

Sharing resources
-----------------

A single computer running the Playdoh server can be used py several clients
in parallel to execute different tasks. The computers' resources need to
be shared among the clients. To do this, each client begins by allocating
on the server
the number of CPUs he wants for his own computation, among all the idle CPUs
on this machine. You have three options.

Using Python
~~~~~~~~~~~~

Resource allocation can be done  using a few functions defined in Playdoh, most notably
:func:`get_available_resources` to get the resources available on a server, and
:func:`request_resources` to allocate resources on a server::

	# Get the available resources on the specified server
	available_resources = get_available_resources('bobs-machine.university.com')
	
	# Allocate 2 CPUs on the server
	request_resources('bobs-machine.university.com', CPU=2)

.. seealso:: Resource allocation example :ref:`example-resources`, reference :ref:`ref_allocation`.

Using the client GUI
~~~~~~~~~~~~~~~~~~~~

Resource allocation can be done with a GUI included in Playdoh and which can be run 
with the command ``playdoh_gui``.

Using the command-line tool
~~~~~~~~~~~~~~~~~~~~~~~~~~~

The command-line tool also allows to allocate resources on servers::
	
    # obtain the available resources on server 'bobs-machine.university.com'
    playdoh get bobs-machine.university.com
    
    # obtain all the allocated resources on the server
    playdoh get bobs-machine.university.com all
    
    # request 2 CPUs and 1 GPU for this client
    playdoh request bobs-machine.university.com 2 CPUs 1 GPU

.. seealso:: Reference :ref:`ref_commandline`.

Server-side
-----------

If you run a Playdoh server on your own computer, you can specify how many resources you allocated
to others. First, you can do that when you launch the Playdoh server (see :ref:`launching`).
Then, when a server is running on your machine, you can change the total number of available 
resources on your server with the function
:func:`set_total_resources`. Finally, you can also use the command-line tool, like::

	playdoh set 2 CPUs 1 GPU

.. seealso:: Reference :ref:`ref_commandline`.


=================
Advanced features
=================

.. _global:

Global variables
----------------

Playdoh defines several global variables.

``MAXCPU``
	The total number of CPUs detected on this computer

``MAXGPU``
	The total number of GPUs detected on this computer (PyCUDA must be installed). Note also that
	PyCUDA must be first initialized with a call to the function ``MAXGPU = initialise_cuda()``
	so that PyCUDA can obtain the number of GPUs available on the current system. You can also
	get the total number of GPUs without initializing PyCUDA on the current process with a call to
	:func:`get_gpu_count`. 

``DEFAULT_PORT``
	The default port to use for the Playdoh server.
	
``USERPREF``
	User preferences dictionary.
	
	.. seealso:: :ref:`userpref`.
	

.. _userpref:
	
User preferences
----------------

You can define some user preferences in the file ``~/.playdoh/userpref.py``.
The character ``~`` refers to your home directory, which should be ``/usr/<username>``
on Linux and ``C:\Users\<username\`` on Windows.

Current preferences are:

``authkey``
	Authentication key used to secure communications within the network. This value must
	be the same on every computer. By default, it is ``playdohauthkey``. You should generate 
	your own key and share it with anyone who might submit computations to your computer.
	The following code snippet shows a way of generating a random 256 bits authentication key
	in Python::
	
		import os, binascii
		authkey = binascii.hexlify(os.urandom(32))

``port``
	Default port used by the Playdoh server. It is 2718 by default.

``loglevel``
	Logging level. Can be ``'DEBUG'``, ``'INFO'`` (default) or ``'WARNING'``.

``favoriteservers``
	List of your servers' IP addresses and ports presented by default in the client GUI, 
	under the form ``IP:port``.	Default is ``[]``.

Here's an example of a valid user preferences file::

	USERPREF = {}
	USERPREF['port'] = 3141

To retrieve user preferences in the code, use the global variable ``USERPREF``
as a dictionary::

	from playdoh import *
	print USERPREF['port']


.. _gpu:

Using GPUs
----------

GPUs are natively supported by Playdoh through the PyCUDA package. It means that
your functions can load CUDA code dynamically and run it, so that you can use
several GPUs on a single or on several machines in parallel. GPUs can be used for both
independent parallel problems and loosely coupled parallel problems (including optimizations).

When loading CUDA code with PyCUDA, you can use the standard functions of PyCUDA to do it
but you should never initialize the GPU drivers yourself: Playdoh takes care of that
so that several GPUs can be handled transparently. Here's an example of a function using PyCUDA
and that can be safely distributed with Playdoh::

    import pycuda
    
    # The function loading the CUDA code
    def fun(scale):
        # The CUDA code, which multiplies a vector by a scale factor.   
        code = '''
        __global__ void test(double *x, int n)
        {
         int i = blockIdx.x * blockDim.x + threadIdx.x;
         if(i>=n) return;
         x[i] *= %d;
        }
        ''' % scale
        
        # Compile the CUDA code to GPU code
        mod = pycuda.compiler.SourceModule(code)
        
        # Transform the CUDA function into a Python function
        f = mod.get_function('test')
        
        # Create a vector on the GPU filled with 8 ones
        x = pycuda.gpuarray.to_gpu(ones(8))
        
        # Start the function on the GPU
        f(x, int32(8), block=(8, 1, 1))
        
        # Load the result from the GPU to the CPU
        y = x.get()
        
        # Finally, return the result
        return y
    
.. warning::

	On Linux, you may experience an issue with the CUDA code not compiling. You can fix this problem
	using ``do_redirect=True`` in the Playdoh function (:func:`map`, :func:`minimize`, etc.).
  
    
.. seealso:: The full example :ref:`example-gpu`, the `PyCUDA website <http://mathema.tician.de/software/pycuda>`__.


.. _allocation:

Resource allocation
-------------------

Resource allocation is the way computing units (CPUs and GPUs on machines)
are allocated to clients' computations. It can be done either manually
or automatically. In the latter case, one specifies the machines 
and the total number of computing units to use.

The main Playdoh functions accept
special keywords ``cpu``, ``gpu``, ``machines`` to
tell Playdoh how to automatically allocate available resources.
Also, they accept the keyword ``allocate`` to do resource allocation
manually. In this case, this keyword must accept an :class:`Allocation` object
returned by the function  :func:`allocate`. Manual resource allocation
is done by specifying the number of units to use on every machine.

The following example shows how to allocate automatically 10 CPUs on
two machines::

	from playdoh import *
	allocation = allocate(machines=['127.0.0.1', '127.0.0.2'], cpu=10)

This object can then be passed to :func:`map` or other Playdoh functions.

In the next example, resource allocation is done manually::

	from playdoh import *
	manual_alloc = {'127.0.0.1': 3, '127.0.0.2': 7}
	allocation = allocate(unit_type='CPU', allocation=manual_alloc)

.. seealso:: :func:`allocate`.


.. _shared_data:

Shared data
-----------

Nodes running on different computers need to have independent
copies of data in memory, but nodes running on different CPUs on a same computer
may have access to shared memory. With Playdoh, it is possible to store some
NumPy arrays in shared memory. This can be more efficient than having in memory
as many copies of one array as processes, especially with very 
large NumPy arrays. However, such shared arrays need to be read-only in order to avoid
contention issues when several processes try to make changes to the same data
at the same time.

Shared data can be used with :func:`map`, :func:`minimize` and
:func:`maximize` functions, using the ``shared_data`` keyword. 
This argument is a dictionary where keys are the item names,
and values are NumPy arrays (or any other type of data).
Then, the task to be executed can retrieve the shared data with the same
``shared_data`` keyword::

	from playdoh import *
	from numpy.random import rand
	
	def fun(..., shared_data):
	    largearray = shared_data['largearray']
    
	map(fun, ..., shared_data={'largearray': rand(1000000)})

.. seealso:: :ref:`example-map_shared`

With optimizations, the fitness function can also accept a special keyword ``shared_data``.


.. _code_transport:

Code transport
--------------

When distributing a Python function with Playdoh using several machines, the function's code
is automatically retrieved and sent to the machines. When the function imports external Python 
packages, these packages need to be installed on every machine.
When the function imports external Python modules (.py files), these modules must be
explicitely specified so that they are also transferred to the other machines. This is
done using the ``codedependencies`` special keyword in the main Playdoh functions.
This argument is a list with the modules' filenames, relatively to the main function location
in the filesystem.

The following example shows how to use the :func:`map` function with an import of 
an external module::

    from playdoh import *
      
    # Import an external module in the same folder
    from external_module import external_fun
      
    # The function to parallelize
    def fun(x):
        # Use the function defined in the external module
        return external_fun(x)**2
      
    # This line is required on Windows, any call to a Playdoh function
    # must be done after this line on this OS.
    # See http://docs.python.org/library/multiprocessing.html#windows
    if __name__ == '__main__':
        # Execute ``fun(1)`` and ``fun(2)`` in parallel on two CPUs on this machine
        # and return the result.
        # The ``codedependencies`` argument contains the list of external Python modules
        # to transfer on the machines executing the task. It is only needed when using
        # remote machines, and not when using CPUs on the local machine.
        print map(fun, [1,2], codedependencies=['external_module.py'])

This also works with the :func:`minimize` and :func:`maximize` functions.

.. seealso:: The full example :ref:`example-map_dependencies` and the reference of :func:`map`.


.. _optinfo:

Optimization information
------------------------

Some information about the optimization can be returned by the :func:`minimize`
and :func:`maximize` functions by specifying the `returninfo=True` special keyword.

.. seealso:: Reference for the :func:`minimize` and :func:`maximize` functions.


.. _groups:

Optimization groups
-------------------

Several groups of parameter populations can be optimized independently and 
in parallel with the same fitness function. This allows a vectorization 
of the fitness function for different optimization runs. The number of groups
is specified with the ``groups`` special keyword in the :func:`minimize` and 
:func:`maximize` functions. The fitness function can accept the ``groups``
keyword to get the number of groups. The total population on the node
is equally subdivided into ``groups`` subpopulations.

.. seealso:: :ref:`example-maximize_groups`, reference for the :func:`minimize` and :func:`maximize` functions.