
Programming Overview
====================

This file explains how a Quixote application is structured.

There are three components to a Quixote application:

1) A FastCGI script.  This script will create a Publisher object, and
   may customize the publisher in some application-specific way.  It
   will also tell the publisher what the root package name for the
   application is; for example, at the MEMS Exchange it's 'mems.ui',
   because all our Web code lives in the mems.ui package.  

   FastCGI isn't an absolute requirement.  It should be trivially easy
   to run Quixote apps using regular CGI, but this can be quite slow
   if very many modules need to be imported at startup time.  We
   haven't tried to run Quixote under mod_python/mod_snake, but it
   shouldn't be too difficult; if you try it, please let us know how
   it goes, and send us the steps you followed to get it working.

2) A configuration file.  This file specifies various features of the
   Publisher class, such as how errors are handled and the paths of
   the log files.  Read through quixote/config.py for the full list
   of configuration settings.

   The most critical configuration parameters are:
      URL_PREFIX    Prefix of URLs that will be directed to Quixote
      ERROR_EMAIL   E-mail address to which errors will be mailed
      ERROR_LOG     File to which errors will be logged

3) Finally, the bulk of the code will be in a Python package; the
   Publisher class will be set up to start traversing at the package's
   root.  


FastCGI script
==============

The FastCGI script can be very simple:

        books.fcgi
	----------    
#!/usr/bin/python2.1
from quixote import imphooks
from quixote.config import Config
from quixote.publish import Publisher

PACKAGE_NAME = 'books'

def main():
    # Create configuration object with default values
    config = Config()

    # Read a configuration file
    config.read_file('/www/conf/books.conf')

    # Install the PTL import hook
    imphooks.install()

    # Create a Publisher instance using the configuration
    pub = Publisher(PACKAGE_NAME, config)

    # Enter the publishing main loop
    pub.publish_cgi()

if __name__ == '__main__': 
    main()
	----------    

That's the simplest possible case.  The UI code is kept in a package
named simply 'books' in this example, so its name is provided when
creating the Publisher instance.

The SessionPublisher class in quixote.publish can also be used; it
provides session tracking.  The changes required to use
SessionPublisher would be:

...
from quixote.publish import SessionPublisher
from quixote.session import SessionManager
...
    pub = SessionPublisher(PACKAGE_NAME, config)
    pub.set_session_manager( SessionManager() )    
...

It's also possible to subclass the Publisher or SessionManager classes
in order to provide some specialized behaviour necessary for your
application.  Some uses for this would be:

      * SessionManager stores user sessions in an in-memory
        dictionary, so restarting the FastCGI process will cause all
        the sessions to be lost.  The FastCGI process is terminated if
        the UI code raises an uncaught exception, so if sessions are
        used to contain important data, such as the contents of a
        shopping cart, it's better if sessions are stored in some
        persistent way.

      * If you're using the ZODB to store data, after each request
        you'll want to commit the current transaction if everything
        ran smoothly, or abort the transaction if an exception was
        raised.

      * The default behaviour on an uncaught exception is to record
        the time, the traceback, and the contents of the request.
        This is written to the configured error log (the ERROR_LOG
        parameter), and mailed to the configured e-mail address (the
        ERROR_EMAIL parameter).  If you wanted some different
        behaviour, you would have to subclass Publisher or
        SessionPublisher and override the finish_failed_request()
        method.
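
The commit-or-abort pattern from the ZODB item above can be sketched
without Quixote or the ZODB at all.  Everything in this sketch is
hypothetical: FakeTransaction is a stand-in rather than the ZODB API,
and run_request() only illustrates where a Publisher subclass might
place the two calls.

```python
class FakeTransaction:
    """Hypothetical stand-in for a database transaction (not the ZODB API)."""

    def __init__(self):
        self.log = []

    def commit(self):
        self.log.append("commit")

    def abort(self):
        self.log.append("abort")


def run_request(handler, request, txn):
    """Call handler(request); commit on success, abort and re-raise on error."""
    try:
        result = handler(request)
    except Exception:
        # Something went wrong: throw away any partial changes.
        txn.abort()
        raise
    else:
        # Everything ran smoothly: make the changes permanent.
        txn.commit()
        return result


def good_handler(request):
    return "page body"


def bad_handler(request):
    raise ValueError("boom")


txn = FakeTransaction()
print(run_request(good_handler, None, txn))
try:
    run_request(bad_handler, None, txn)
except ValueError:
    print("handler failed; transaction log:", txn.log)
```

Re-raising after the abort keeps the publisher's normal error
handling (logging and e-mail) in play.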

This file won't try to explain how to write subclasses of the Quixote
classes; read the docstrings in the code for detailed explanations.

The FastCGI script must then be copied to the CGI directory for your
Web server, and marked as executable.  Consult your Web server
documentation to find out how to set up a CGI directory, and how to
mark the script as requiring FastCGI.

If you're using Apache, you should compile it with mod_fastcgi and
mod_rewrite enabled, and then the following snippets can be added to
your httpd.conf file:

         Excerpt from Apache httpd.conf
         ------------------------------
# FastCGI configuration directives for Quixote
FastCgiIpcDir /www/httpd/var
FastCgiServer /www/cgi-bin/books.fcgi -socket fcgi.sock 
AddHandler fastcgi-script .fcgi
         ------------------------------

With this configuration, Quixote would be accessed using the URL
http://.../cgi-bin/books.fcgi/x/y/z.  A shorthand form for accessing
Quixote can be added using the following mod_rewrite directive, which
makes http://.../q/x/y/z equivalent to
http://.../cgi-bin/books.fcgi/x/y/z.

         Excerpt from Apache httpd.conf
         ------------------------------
# Rewrite rule to make a shorthand for access to Quixote
RewriteEngine on
RewriteRule ^/q(/.*)$ /www/cgi-bin/books.fcgi$1 [t=fastcgi-script,l]
         ------------------------------

If ^(/.*)$ is used as the pattern, then all accesses to the site will
be sent to Quixote, and no /q/ prefix is needed.  In this case, you'll
probably have to make some exceptions for URLs that shouldn't be
directed to Quixote; images, CSS files, and the robots.txt file are
commonly left as static files.  This is done by adding the following
line; it must come *before* the previous RewriteRule so it can stop
the mod_rewrite engine from trying further patterns.

         Excerpt from Apache httpd.conf
         ------------------------------
# Don't rewrite URLs to static content
RewriteRule ^/(robots\.txt|base\.css|images|icons) - [last]
         ------------------------------
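
Why the ordering matters can be seen in a small simulation.  This is
only a rough model of first-match-wins rule processing, not of
mod_rewrite itself; the catch-all substitution and the rewrite()
helper are assumptions made for the illustration.

```python
import re

# Each rule is (pattern, substitution, is_last).  A substitution of "-"
# means "leave the URL unchanged", as in the directive above.
STATIC_RULE = (r"^/(robots\.txt|base\.css|images|icons)", "-", True)
CATCH_ALL = (r"^/(.*)$", r"/www/cgi-bin/books.fcgi/\1", True)


def rewrite(url, rules):
    """Apply rules in order; a matching rule marked is_last stops processing."""
    for pattern, substitution, is_last in rules:
        match = re.match(pattern, url)
        if match is None:
            continue
        if substitution != "-":
            url = match.expand(substitution)
        if is_last:
            break
    return url


good_order = [STATIC_RULE, CATCH_ALL]
bad_order = [CATCH_ALL, STATIC_RULE]

print(rewrite("/images/logo.png", good_order))  # left alone: static file
print(rewrite("/run/250/", good_order))         # handed to the script
print(rewrite("/images/logo.png", bad_order))   # wrongly handed to the script
```

With the static rule listed second, the catch-all rule has already
claimed the URL before the exception is ever consulted.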


Configuration file
==================

In the example books.fcgi script, configuration information is read
from a file by this line of code:
    config.read_file('/www/conf/books.conf')

You should never edit the default values in quixote/config.py, because
your edits will be lost if you upgrade to a newer Quixote version.
You should certainly read it, though, to understand what all the configuration
parameters are.

The configuration file contains Python code, which is then evaluated
using Python's built-in function execfile().  Variable assignments are
performed within the Config object's dictionary, so it's easy to set
values:

        books.conf
	----------
ACCESS_LOG = "/www/log/access/quixote.log" 
DEBUG_LOG = "/www/log/quixote-debug.log"
ERROR_LOG = "/www/log/quixote-error.log"

You can also execute arbitrary Python code to figure out what the
variables should be.  The following example changes some settings to
be more convenient for a developer when the MX_MODE environment
variable is the string 'DEVEL':

import os

mx_mode = os.environ["MX_MODE"]
if mx_mode == "DEVEL":
    DISPLAY_EXCEPTIONS = 1
    SECURE_ERRORS = 0
    RUN_ONCE = 1
elif mx_mode in ("STAGING", "LIVE"):
    DISPLAY_EXCEPTIONS = 0
    SECURE_ERRORS = 1
    RUN_ONCE = 0
else:
    raise RuntimeError("unknown server mode: %s" % mx_mode)

We use this flexibility to display tracebacks in DEVEL mode, to
redirect generated e-mails to a staging address in STAGING mode, and
to enable all features in LIVE mode.
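
The idea of evaluating a configuration file as Python code inside a
dictionary can be sketched in a few lines.  This is not Quixote's
Config class: load_settings() and CONFIG_SOURCE are hypothetical, the
sketch reads from a string rather than a file, and it uses exec(),
which plays the role of execfile() on newer Python versions.

```python
import os

# Hypothetical configuration text; the setting names are taken from
# the examples in this section.
CONFIG_SOURCE = '''
import os
ERROR_LOG = "/www/log/quixote-error.log"
if os.environ.get("MX_MODE") == "DEVEL":
    DISPLAY_EXCEPTIONS = 1
else:
    DISPLAY_EXCEPTIONS = 0
'''


def load_settings(source, defaults):
    """Run `source` as Python code on top of `defaults`.

    Only upper-case names are returned, so helper names such as 'os'
    do not leak into the settings.
    """
    namespace = dict(defaults)
    exec(source, namespace)
    return {name: value for name, value in namespace.items()
            if name.isupper()}


os.environ["MX_MODE"] = "DEVEL"
settings = load_settings(CONFIG_SOURCE,
                         {"DISPLAY_EXCEPTIONS": 0, "RUN_ONCE": 0})
for name in sorted(settings):
    print(name, "=", settings[name])
```

Values that the configuration text does not assign keep their
defaults, which is why the file only needs to mention the settings
that differ.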


'books' Package
===============

Finally, we reach the most complicated part of a Quixote application.
However, thanks to Quixote's design, everything you've ever learned
about designing and writing Python code should be applicable, so there
are no new hoops to jump through.   

An application's code lives in a Python package that contains both .py
and .ptl files.  Complicated logic should be in .py files, while .ptl
files, ideally, should contain only the logic needed to render your
Web interface and basic objects as HTML.  

Quixote's publisher will start at the root of this package, and will
treat the rest of the URL as a path into the package's contents.  Here
are some examples, assuming that the URL_PREFIX is '/q', and the root
package is 'books':

http://.../q/             will call    books._q_index()
http://.../q/other        will call    books.other(), if books.other
                                       is a function.
http://.../q/other        will call    books.other._q_index(), if books.other
                                       is a module or a subpackage.

One of PTL's design goals is "Be explicit."  Therefore there's no
complicated rule for remembering which functions in a module are
public; you just have to list them all in the _q_exports variable,
which should be a list of strings naming the public functions.  You
don't need to list the _q_index function as being public; that's
assumed.

	books/__init__.py
	-----------------

_q_exports = ["other"]

from pages import _q_index

def other(request):
    return "Handled by a Python function."
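
The traversal and the role of _q_exports can be illustrated with a
simplified sketch.  It is not the Publisher's actual algorithm:
traverse() is a hypothetical helper, the 'books' package is built in
memory, and the 'admin' and 'secret' names are made up for the
example.

```python
import types


def traverse(root, path):
    """Resolve a URL path against `root` and return the callable to invoke."""
    obj = root
    for component in [c for c in path.split("/") if c]:
        # Only names listed in _q_exports may be reached from the Web.
        if component not in getattr(obj, "_q_exports", []):
            raise LookupError("%r is not exported" % component)
        obj = getattr(obj, component)
    # Functions are called directly; modules answer through _q_index().
    if isinstance(obj, types.ModuleType):
        return obj._q_index
    return obj


# An in-memory stand-in for the 'books' package.
books = types.ModuleType("books")
books._q_exports = ["other", "admin"]
books._q_index = lambda request: "books index"
books.other = lambda request: "Handled by a Python function."
books.secret = lambda request: "not listed in _q_exports"

admin = types.ModuleType("books.admin")
admin._q_exports = []
admin._q_index = lambda request: "admin index"
books.admin = admin

print(traverse(books, "/")(None))
print(traverse(books, "/other")(None))
print(traverse(books, "/admin/")(None))
```

Asking for "/secret" raises LookupError even though books.secret
exists, because the name is not listed in _q_exports.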

When a function is callable from the Web, it must expect a single
parameter, which will be an object containing the contents of the HTTP
request.  'request' will be an instance of the HTTPRequest class, and
provides methods for reading form values, environment variables, and
the usual CGI-ish data.  When using SessionPublisher, request.session
is a Session object for the user agent making the request.

Use 'pydoc quixote.zope.HTTPRequest' to get a full listing of
HTTPRequest's methods.

The function must return either a string or a TemplateIO object; PTL
templates return a TemplateIO object.  request.response is an
HTTPResponse instance, which has methods for setting the content-type
of the function's output, generating an HTTP redirect, specifying
arbitrary HTTP response headers, and other common tasks.  Use 'pydoc
quixote.zope.HTTPResponse' to get a full listing of HTTPResponse's
methods.

There are two (and *only* two) ways to affect the Publisher's
traversal.  

_q_access(request)

   If this function is present in a module, it will be called before
   attempting to traverse any further.  It can look at the contents of
   request and decide if the traversal can continue; if not, it should
   raise quixote.errors.AccessError (or a subclass), and Quixote will
   return a 403 Forbidden HTTP status code.  The return value is
   ignored if _q_access() doesn't raise an exception.

   For example, in the MEMS Exchange code, we have some sets of pages
   that are only accessible to signed-in users of a certain type.  The
   _q_access() function looks like this:

def _q_access (request):
    if request.session.user is None:
        raise NotLoggedInError("You must be signed in to view reports.")
    if not (request.session.user.is_MX() or
            request.session.user.is_fab()):
        raise MXAccessError("Only MEMS Exchange and fab staff can view "
                            "reports.")

   This is less error-prone than having to remember to add checks to 
   every single public function.


_q_getname(request, component)

   This function translates an arbitrary string into an object that we
   continue traversing.  This is very handy; it lets you put
   user-space objects into your URL-space, eliminating the need for
   digging ID strings out of a query, or checking PATH_INFO after
   Quixote's done with it.  But it is a compromise with security: it
   opens up the traversal algorithm to arbitrary names not listed in
   _q_exports.  You should therefore be extremely paranoid about
   checking the value of 'component'.

   'request' is the request object, as it is everywhere else;
   'component' is a string containing the next chunk of the path.
   _q_getname() should return some object that can be traversed
   further, so it should have a _q_index() method, a _q_exports
   attribute, and optionally _q_access() or its own _q_getname().
   We generally write special classes for this purpose, though you
   could choose a particular module and return that instead.
       
   For example, we want people to be able to go to
   http://.../q/run/250/ to view run #250.  This is more readable than
   the alternatives '/q/run/?id=250' or even '/q/run?250'.  The
   corresponding function and class look like this:

def _q_getname (request, component):
    return RunUI(request, component)

class RunUI:
    _q_exports = ['details']

    def __init__ (self, request, component):
        run_id = int(component)
        run_db = get_run_database()
        self.run = run_db.get_run(run_id)
        if not self.run.can_access(request.session.user):
            raise MXAccessError("You are not allowed to access run %d." %
                                run_id)

    def _q_index (self, request):
        ...
    def details (self, request):
        ...

The __init__() method is actually much longer, and is very paranoid
about checking whether the value of 'component' is actually a number,
whether the run exists, and whether the user is permitted to view
that run.
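
A minimal sketch of that kind of paranoia, covering only the "is it a
number" and "does the run exist" checks, might look like the
following.  It is not the MEMS Exchange code: TraversalError, the
RUNS dictionary, and the nine-digit limit are assumptions made for
the illustration.

```python
class TraversalError(Exception):
    """Hypothetical error for a path component that names nothing."""


# Hypothetical in-memory stand-in for the run database.
RUNS = {250: "run #250"}


def parse_run_id(component):
    """Convert a path component to a run ID, accepting only ASCII digits."""
    is_number = component.isascii() and component.isdigit()
    if not is_number or len(component) > 9:
        raise TraversalError("invalid run ID: %r" % component)
    return int(component)


def lookup_run(component):
    """Return the run named by `component`, or raise TraversalError."""
    run_id = parse_run_id(component)
    try:
        return RUNS[run_id]
    except KeyError:
        raise TraversalError("no such run: %d" % run_id)


print(lookup_run("250"))
for bad in ("", "abc", "-1", "2.5", "999"):
    try:
        lookup_run(bad)
    except TraversalError as exc:
        print("rejected:", exc)
```

Checking the string before calling int() keeps inputs such as signs,
whitespace, and non-ASCII digits from slipping through, since int()
on its own accepts all of those.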


-- 
A.M. Kuchling    <akuchlin@mems-exchange.org>
Neil Schemenauer <nascheme@mems-exchange.org>
Greg Ward        <gward@python.net>


