Running the Quixote Demo
========================

Quixote comes with a tiny demonstration application that you can install
and run on your web server.  In a few dozen lines of Python and PTL
code, it demonstrates most of Quixote's basic capabilities.  It's also
an easy way to make sure that your Python installation and web server
configuration are cooperating so that Quixote applications can work.


Installation
------------

The demo is included in the quixote.demo package, which is installed
along with the rest of Quixote when you run "python setup.py install".
The driver script (demo.cgi) and associated configuration file
(demo.conf) are *not* installed automatically -- you'll have to copy
them from the demo/ subdirectory to your web server's CGI directory.
E.g., if you happen to use the same web server tree as we do:

  cp -p demo/demo.cgi demo/demo.conf /www/cgi-bin

You'll almost certainly need to edit the "#!" line of demo.cgi to ensure
that it points to the correct Python interpreter -- it should be the
same interpreter that you used to run "setup.py install".


Verifying the installation
--------------------------

Before we try to access the demo via your web server, let's make sure
that the quixote and quixote.demo packages are installed on your system:

  $ python
  Python 2.1.1 (#2, Jul 30 2001, 12:04:51) 
  [GCC 2.95.2 20000220 (Debian GNU/Linux)] on linux2
  Type "copyright", "credits" or "license" for more information.
  >>> import quixote
  >>> quixote.enable_ptl()
  >>> import quixote.demo

(Quixote requires Python 2.0 or greater; you might have to name an
explicit Python interpreter, e.g. "/usr/local/bin/python2.1".  Make sure
that the Python interpreter you use here is the same one you put in the
"#!" line of demo.cgi, and the same one you used to install Quixote.)

If this runs without errors, then Quixote (and its demo) are installed
such that you can import them.  It remains to be seen if the user that
will run the driver script -- usually "nobody" -- can import them.
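
To repeat this check non-interactively (for instance, while logged in as
the user that runs CGI scripts), a small script along these lines may
help.  It is only a sketch: the helper name can_import() is made up for
illustration, and it is written for a current Python 3 with importlib
rather than the Python 2.1 shown above.

```python
import importlib
import sys


def can_import(module_name):
    """Return True if module_name can be imported by this interpreter."""
    try:
        importlib.import_module(module_name)
    except ImportError:
        return False
    return True


if __name__ == "__main__":
    # The path printed here is the one that belongs in the "#!" line.
    print("interpreter:", sys.executable)
    if can_import("quixote"):
        import quixote
        # The demo contains PTL modules, so enable the PTL import hook first
        # (as in the interactive session above) if this version provides it.
        enable = getattr(quixote, "enable_ptl", None)
        if enable is not None:
            enable()
    for name in ("quixote", "quixote.demo"):
        status = "ok" if can_import(name) else "NOT importable"
        print("%-14s %s" % (name, status))
```

Run it with the same interpreter that demo.cgi names in its "#!" line;
a different interpreter may have a different module search path.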


Running the demo directly
-------------------------

Assuming that
  * your web server is running on the current host
  * your web server is configured to handle requests to
    /cgi-bin/demo.cgi by running the demo.cgi script that you just
    installed (e.g. to /www/cgi-bin/demo.cgi)
you should now be able to run the Quixote demo by directly referring
to the demo.cgi script.

Start a web browser and load
  http://localhost/cgi-bin/demo.cgi/

You should see a page titled "Quixote Demo" with the headline "Hello,
world!".  If not, go look in your web server's error log.  Some things
that might go wrong:
    
  * your web server is not configured to run CGI scripts, or it
    might use a different base URL for them.  If you're running
    Apache, look for something like
      ScriptAlias /cgi-bin/ /www/cgi-bin/
    in your httpd.conf.

    (This is not a problem with Quixote or the Quixote demo; this is a
    problem with your web server's configuration.)

  * your web server was unable to execute the script.  Make sure
    its permissions are correct:
      chmod 755 /www/cgi-bin/demo.cgi

    (This shouldn't happen if you install demo.cgi with "cp -p" as
    illustrated above.)

  * demo.cgi started, but was unable to import the Quixote modules.
    In this case, there should be a short Python traceback in your web
    server's error log ending with a message like "ImportError: No
    module named quixote".

    Remember, just because you can "import quixote" in a Python
    interpreter doesn't mean the user that runs CGI scripts (usually
    "nobody") can.  You might have installed Quixote in a non-standard
    location, in which case you should either install it in the standard
    location (your Python interpreter's "site-packages" directory) or
    instruct your web server to set the PYTHONPATH environment variable.
    Or you might be using the wrong Python interpreter -- check the "#!" 
    line of demo.cgi.

  * demo.cgi started and imported Quixote, but was unable to read its
    config file.  There should be a short Python traceback in your web
    server's error log ending with a message like "IOError: [Errno 2] No
    such file or directory: 'demo.conf'" in this case.

    Make sure you copied demo.conf to the same directory as demo.cgi,
    and make sure it's readable:
      chmod 644 /www/cgi-bin/demo.conf

    (This shouldn't happen if you install demo.conf with "cp -p" as
    illustrated above.)
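
The two permission checks above can also be done from Python.  This is a
rough sketch for a POSIX system; the function names are hypothetical,
and it only looks at the "other" permission bits of the file itself (a
user such as "nobody" also needs search permission on every directory
leading to the file, which this does not check).

```python
import os
import stat


def describe_mode(path):
    """Return the permission bits of path as an octal string such as '755'."""
    return "%o" % stat.S_IMODE(os.stat(path).st_mode)


def world_can(path, read=False, execute=False):
    """Check the 'other' permission bits, which apply to a user like nobody."""
    mode = os.stat(path).st_mode
    if read and not mode & stat.S_IROTH:
        return False
    if execute and not mode & stat.S_IXOTH:
        return False
    return True
```

For example, world_can("/www/cgi-bin/demo.cgi", read=True, execute=True)
should be true for the driver script, and
world_can("/www/cgi-bin/demo.conf", read=True) for its config file.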


Running the demo indirectly
---------------------------

One of the main tenets of Quixote's design is that, in a web
application, the URL is part of the user interface.  We consider it
undesirable to expose implementation details -- such as
"/cgi-bin/demo.cgi" -- to users.  That sort of thing should be tucked
away out of sight.  Depending on your web server, this should be doable
with a simple tweak to its configuration.

For example, say you want the "/qdemo" URL to be the location of the
Quixote demo.  If you're using Apache with the rewrite engine loaded and
enabled, all you need to do is add this to your httpd.conf:

  RewriteRule ^/qdemo(/.*) /www/cgi-bin/demo.cgi$1 [l]

With this rule in effect (don't forget to restart your server!),
accesses to "/qdemo/" are the same as accesses to "/cgi-bin/demo.cgi/" --
except they're a lot easier for the user to understand and don't expose
implementation details of your application.

Try it out.  In your web browser, visit:
  http://localhost/qdemo/

You should get exactly the same page as you got visiting
"/cgi-bin/demo.cgi/" earlier, and all the links should work exactly the
same.

You can use any URL prefix you like -- there's nothing special about
"/qdemo".  Well, there is *one* special thing about it: it's the default
value for the URL_PREFIX configuration variable.  If you use a different
prefix in your web server's configuration, say "/foo", be sure to set
URL_PREFIX in demo.conf:
  URL_PREFIX = "/foo"

If you don't, Quixote will generate incorrect internal redirects.
(URL_PREFIX is just used to tell Quixote what you've already told your
web server.  If you don't keep the two in sync, you'll have problems
whenever Quixote needs to generate a redirect within your application.)
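
To see why the two settings have to agree, here is a toy illustration.
It is not Quixote's code: redirect_location() is a hypothetical helper
that just joins a prefix and a path the way any application would have
to when it builds a redirect to one of its own pages.

```python
def redirect_location(url_prefix, path):
    """Join the configured prefix and an application path into one URL path."""
    return url_prefix.rstrip("/") + "/" + path.lstrip("/")


# Assume the web server maps "/foo/..." to the application in both cases.
in_sync = redirect_location("/foo", "simple")        # prefix matches the server
out_of_sync = redirect_location("/qdemo", "simple")  # default prefix left alone
```

Here in_sync is "/foo/simple", which the web server knows how to route,
while out_of_sync is "/qdemo/simple", which it does not.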

One small but important detail here is "/qdemo" versus "/qdemo/".  In
the above configuration, requests for "/qdemo" will fail, and requests
for "/qdemo/" will succeed.  See the "URL rewriting" section of
web-server.txt for details and how to fix this.
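
The difference between "/qdemo" and "/qdemo/" is easy to see if you try
the regular expression by hand.  The sketch below assumes a rule whose
pattern is ^/qdemo(/.*) and whose substitution is /www/cgi-bin/demo.cgi$1,
and mimics it with Python's re module (which writes the backreference as
\1 rather than $1).  It only imitates the pattern matching; it says
nothing about how Apache processes the rule otherwise.

```python
import re

# Pattern and substitution of the assumed rewrite rule.
PATTERN = re.compile(r"^/qdemo(/.*)")
SUBSTITUTION = r"/www/cgi-bin/demo.cgi\1"


def rewrite(url_path):
    """Return the rewritten path, or None if the rule does not match."""
    if PATTERN.match(url_path) is None:
        return None
    return PATTERN.sub(SUBSTITUTION, url_path)


if __name__ == "__main__":
    for path in ("/qdemo", "/qdemo/", "/qdemo/simple"):
        print("%-15s -> %s" % (path, rewrite(path)))
```

The pattern insists on a slash after "/qdemo", so a request for plain
"/qdemo" is never handed to demo.cgi at all.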


Understanding the demo
----------------------

Now that you've gotten the demo to run successfully, let's look under
the hood and see how it works.  Before we start following links in the
demo (don't worry if you already have, you can't hurt anything), make
sure you're watching all the relevant log files.  As with any web
application, log files are essential for debugging Quixote applications.

Assuming that your web server's error log is in /www/log/error_log, and
that you haven't changed the DEBUG_LOG and ERROR_LOG settings in
demo.conf:

  $ tail -f /www/log/error_log & \
    tail -f /tmp/quixote-demo-debug.log & \
    tail -f /tmp/quixote-demo-error.log 

(Note that recent versions of GNU tail let you tail multiple files with
the same command.  Cool!)

[Lesson 1: the top page]

Reload the top of the demo, presumably http://localhost/qdemo/.  You
should see "debug message from the index page" in the debug log file.

Where is this message coming from?  To find out, we need to delve into
the source code for the demo.  Load up demo/__init__.py and let's take a
look.  In the process, we'll learn how to explore a Quixote application
and find the source code that corresponds to a given URL.

First, why are we loading demo/__init__.py?  Because that's where some
of the names in the "quixote.demo" namespace are defined, and it's where
the list of names that may be "exported" by Quixote from this namespace
to the web is given.  Recall that under Quixote, every URL boils down to
a callable Python object -- usually a function or method.  The root of
this application is a Python package ("quixote.demo"), which is just a
special kind of module.  But modules aren't callable -- so what does the
"/qdemo" URL boil down to?  That's what '_q_index()' is for -- you can
define a special function that is called by default when Quixote
resolves a URL to a namespace rather than a callable.  That is, "/qdemo"
resolves to the "quixote.demo" package; a package is a namespace, so it
can't be called; therefore Quixote looks for a function called
'_q_index()' in that namespace and calls it.

In this case, '_q_index()' is not defined in demo/__init__.py -- but it
is imported there from the quixote.demo.pages module.  This is actually
a PTL module -- demo/pages.py does not exist, but demo/pages.ptl does.
So load it up and take a look:

  template _q_index(request):
      print "debug message from the index page"
      """
      <html>
      <head><title>Quixote Demo</title></head>
      <body>
      <h1>Hello, world!</h1>
      [...]
      </body>
      </html>
      """

A-ha!  There's the PTL code that generates the "Quixote Demo" page.
This '_q_index()' template is quite simple PTL -- it's mostly an HTML
document with a single debug print thrown in to demonstrate Quixote's
debug logging facility.

Outcome of lesson 1:
  * a URL maps to either a namespace (package, module, class instance) 
    or a callable (function, method, PTL template)

  * if a URL maps to a namespace, Quixote looks for a callable 
    '_q_index()' in that namespace and calls it

  * '_q_index()' doesn't have to be explicitly exported by your 
    namespace; if it exists, it will be used

  * anything your application prints to standard output goes to
    Quixote's debug log.  (If you didn't specify a debug log in
    your config file, debug messages are discarded.)
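
The last point can be imitated in a few lines of plain Python.  This is
not how Quixote itself is implemented; it is a minimal sketch (for a
current Python 3) of the general idea of pointing standard output at a
log while a page handler runs.  The names index_page() and
call_with_debug_log() are invented for the example.

```python
import contextlib
import io


def index_page():
    # Stand-in for the _q_index() template: one debug print, then the page.
    print("debug message from the index page")
    return "<html><body><h1>Hello, world!</h1></body></html>"


def call_with_debug_log(handler, debug_log):
    """Call handler with sys.stdout pointed at debug_log; return its result."""
    with contextlib.redirect_stdout(debug_log):
        return handler()


debug_log = io.StringIO()  # a real application would use a log file
page = call_with_debug_log(index_page, debug_log)
```

Afterwards the printed message is in debug_log, and page holds only the
HTML that would be sent to the browser.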


[Lesson 2: a link to a simple document]

The first two links in the "Quixote Demo" page are quite simple.  Each
one is handled by a Python function defined in the "quixote.demo"
namespace, i.e. in demo/__init__.py.  For example, following the
"simple" link is equivalent to calling the 'simple()' function in
"quixote.demo".  Let's take a look at that function:

  def simple (request):
      request.response.setHeader("Content-type", "text/plain")
      return "This is the Python function 'quixote.demo.simple'.\n"

Note that this could equivalently be coded in PTL:

  template simple (request):
      request.response.setHeader("Content-type", "text/plain")
      "This is the Python function 'quixote.demo.simple'.\n"

...but for such a simple document, why bother?

Since this function doesn't generate an HTML document, it would be
misleading for the HTTP response that Quixote generates to claim a
"Content-type" of "text/html".  That is the default for Quixote's HTTP
responses, however, since most HTTP responses are indeed HTML documents.
Therefore, if the content you're returning is anything other than an
HTML document, you should set the "Content-type" header on the HTTP
response.

This brings up a larger issue: request and response objects.  Quixote
includes two classes, HTTPRequest and HTTPResponse, to encapsulate every
HTTP request and its accompanying response.  Whenever Quixote resolves a
URL to a callable and calls it, it passes precisely one argument: an
HTTPRequest object.

The HTTPRequest object includes (almost) everything you might want to
know about the HTTP request that caused Quixote to be invoked and to
call a particular function, method, or PTL template.  You have access to
CGI environment variables, HTML form variables (parsed and
ready-to-use), and HTTP cookies.  Finally, the HTTPRequest object also
includes an HTTPResponse object -- after all, every request implies a
response.  You can set the response status, set response headers, set
cookies, or force a redirect using the HTTPResponse object.
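
If the relationship between the two objects is hard to picture, the toy
classes below may help.  They are hypothetical stand-ins written for
this explanation (in current Python 3), not Quixote's HTTPRequest and
HTTPResponse, and their attribute and method names are assumptions.

```python
class ToyResponse:
    """Hypothetical stand-in for a response object."""

    def __init__(self):
        self.status = 200
        # HTML is the default, as described above.
        self.headers = {"content-type": "text/html"}

    def set_header(self, name, value):
        # HTTP header names are case-insensitive, so store them lowercased.
        self.headers[name.lower()] = value


class ToyRequest:
    """Hypothetical stand-in for a request object carrying its response."""

    def __init__(self, environ=None, form=None, cookies=None):
        self.environ = environ or {}
        self.form = form or {}
        self.cookies = cookies or {}
        self.response = ToyResponse()


def simple(request):
    # The handler receives one argument and reaches the response through it.
    request.response.set_header("Content-type", "text/plain")
    return "This is the Python function 'quixote.demo.simple'.\n"
```

Calling simple(ToyRequest()) returns the body and, as a side effect,
changes the content type of that request's response from the HTML
default to "text/plain".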

Note that it's not enough that the 'simple()' function merely exists.
If that were the case, then overly-curious users or attackers could
craft URLs that point to any Python function in any module under your
application's root namespace, potentially causing all sorts of havoc.
You need to explicitly declare which names are exported from your
application to the web, using the '_q_exports' variable.  For example,
demo/__init__.py has this export list:
  _q_exports = ["simple", "error"]

This means that only these two names are explicitly exported by the
Quixote demo.  (The empty string is implicitly exported from a namespace
if a '_q_index()' callable exists there -- thus "/qdemo/" is handled by
'_q_index()' in the "quixote.demo" namespace.  Arbitrary names may be
implicitly exported using a '_q_getname()' function; see Lesson 4
below.)
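
The export check itself is simple enough to sketch.  The following is a
toy version for a single URL component, written in current Python 3; the
lookup() function, the NotExported exception and the "secret" handler are
all made up for the example, and Quixote's real traversal code does more
than this.

```python
import types


class NotExported(Exception):
    """Raised when a name is missing or not listed in _q_exports."""


def lookup(namespace, name):
    """Resolve one URL component in namespace, honouring _q_exports."""
    if name == "":
        # The empty name is implicitly exported if _q_index exists.
        index = getattr(namespace, "_q_index", None)
        if index is None:
            raise NotExported("no _q_index in namespace")
        return index
    if name not in getattr(namespace, "_q_exports", []):
        raise NotExported(name)
    return getattr(namespace, name)


# A toy namespace standing in for the quixote.demo package.
demo = types.SimpleNamespace(
    _q_exports=["simple", "error"],
    _q_index=lambda request: "index page",
    simple=lambda request: "simple page",
    error=lambda request: "error page",
    secret=lambda request: "not meant for the web",
)
```

With this, lookup(demo, "simple") and lookup(demo, "") succeed, while
lookup(demo, "secret") raises NotExported even though the function
exists.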


[Lesson 3: error-handling]

The next link in the "Quixote Demo" page is to the "error" document,
which is handled by the 'error()' function in demo/__init__.py.  All
this function does is raise an exception:

  def error (request):
      raise ValueError, "this is a Python exception"

Follow the link, and you should see a Python traceback followed by a
dump of the CGI environment for this request (along with other request
data, such as a list of cookies).

This is extremely useful when developing, testing, and debugging.  In a
production environment, though, it reveals way too much about your
implementation to hapless users who should happen to hit an error, and
it also reveals internal details to attackers who might use it to crack
your site.  (It's just as easy to write an insecure web application with
Quixote as with any other tool.)

Thus, Quixote offers the DISPLAY_EXCEPTIONS config variable.  This is
false by default, but the demo.conf file enables it.  To see what
happens with DISPLAY_EXCEPTIONS off, edit demo.conf and reload the
"error" page.  You should see a bland, generic error message that
reveals very little about your implementation.  (This error page is
deliberately very similar, but not identical, to Apache's "Internal
Server Error" page.)

Unhandled exceptions raised by application code (aka "application bugs")
are only one kind of error you're likely to encounter when developing a
Quixote application.  The other ones are:

  * driver script crashes or doesn't run (e.g. can't import quixote
    modules, can't load config file).  This is covered under
    "Running the demo directly" above

  * publishing errors, such as a request for "/simpel" that should
    have been "/simple", or a request for a resource that exists but is
    denied to the current user.  Quixote has a family of exception
    classes for dealing with these; such exceptions may be raised by
    Quixote itself or by your application.  They are handled by Quixote
    and turned into HTTP error responses (4xx status code).

  * bugs in Quixote itself; hopefully this won't happen, but you
    never know.  These usually look a lot like problems in the driver
    script: the script crashes and prints a traceback to stderr, which
    most likely winds up in your web server's error log.  The length of
    the traceback is generally a clue as to whether there's a problem
    with your driver script or a bug in Quixote.

Publishing errors result in a 4xx HTTP response code, and are entirely
handled by Quixote -- that is, your web server just returns the HTTP
response that Quixote prepares.

Application bugs result in a 5xx HTTP response code, and are similarly
entirely handled by Quixote.  Don't get confused by the fact that
Quixote's and Apache's "Internal Server Error" pages are quite similar!

Driver script crashes and Quixote bugs (which are essentially the same
thing; the main difference is who to blame) are handled by your web
server.  (In the first case, Quixote doesn't even enter into it; in the
second case, Quixote dies horribly and is no longer in control.)  Under
Apache, the Python traceback resulting from the crash is written to
Apache's error log, and a 5xx response is returned to the client with
Apache's "Internal Server Error" error page.
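
The division of labour for the first two kinds of error can be
summarized in a short sketch.  Everything here is hypothetical and
written in current Python 3: PublishError and NotFound stand in for
Quixote's family of publishing exceptions, and publish() is a toy, not
Quixote's publisher.

```python
import traceback


class PublishError(Exception):
    """Hypothetical base class for publishing errors (4xx responses)."""
    status_code = 400


class NotFound(PublishError):
    status_code = 404


def publish(handler, display_exceptions=False):
    """Call handler and return (status, body), never letting errors escape."""
    try:
        return 200, handler()
    except PublishError as exc:
        # Publishing errors become 4xx responses.
        return exc.status_code, str(exc)
    except Exception:
        # Anything else is an application bug: a 5xx response whose body
        # depends on the display_exceptions setting.
        if display_exceptions:
            return 500, traceback.format_exc()
        return 500, "Internal Server Error"


def error():
    raise ValueError("this is a Python exception")


def missing():
    raise NotFound("no such page: /simpel")
```

If the publisher itself crashes, or the script never gets as far as
calling it, nothing in this function runs, which is why those cases
fall to the web server.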


[Lesson 4: object publishing]

Publishing Python callables on the web -- i.e., translating URLs to
Python functions/methods/PTL templates and calling them to determine the
HTTP response -- is a very powerful way of writing web applications.
However, Quixote has one more trick up its sleeve: object publishing.
You can translate arbitrary names to arbitrary objects which are then
published on the web, and you can create URLs that call methods on those
objects.

This is all accomplished with the '_q_getname()' function.  Every
namespace that Quixote encounters may have a '_q_getname()', just like
it may have a '_q_index()'.  '_q_index()' is used to handle requests for
the empty name -- as we saw in Lesson 1, a request for "/qdemo/" maps to
the "quixote.demo" namespace; the empty string after the last slash
means that Quixote will call '_q_index()' in this namespace to handle
the request.

'_q_getname()' is for requests that aren't handled by a Python callable
in the namespace.  As seen in Lessons 2 and 3, requests for
"/qdemo/simple" and "/qdemo/error" are handled by the 'simple()' and
'error()' functions in the "quixote.demo" namespace.  What if someone
requests "/qdemo/foo"?  There's no function 'foo()' in the
"quixote.demo" namespace, so normally this would be an error.
(Specifically, it would be a publishing error: Quixote would raise
TraversalError, which is the error used for non-existent or non-exported
names.  Another part of Quixote then turns this into an HTTP 404
response.)

However, this particular namespace also defines a '_q_getname()'
function.  That means that the application wants a chance to handle
unknown names before Quixote gives up entirely.  Let's take a look at
the implementation of '_q_getname()':

  from quixote.demo.integer_ui import IntegerUI
  [...]
  def _q_getname(request, component):
      return IntegerUI(request, component)

Pretty simple: we just construct an IntegerUI object and return it.  So
what is IntegerUI?  Take a look in the demo/integer_ui.py file to see;
it's just a web interface to integers.  (Normally, you would write a
wrapper class that provides a web interface to something more
interesting than integers.  This just demonstrates how simple an object
published by Quixote can be.)

So, what is an IntegerUI object?  From Quixote's point of view, it's
just another namespace to publish: like modules and packages, class
instances have attributes, some of which (methods) are callable.  In the
case of IntegerUI, two of those attributes are '_q_exports' and
'_q_index' -- every namespace published by Quixote must have an export
list, and an index function is almost always advisable.

What this means is that any name that the IntegerUI constructor accepts
is a valid name to tack onto the "/qdemo/" URL.  Take a look at the
IntegerUI constructor; you'll see that it works fine when passed
something that can be converted to an integer (e.g. "12" or 1.0), and
raises Quixote's TraversalError if not.  As it happens, Quixote always
passes in a string -- URLs are just strings, after all -- so we only
have to worry about things like "12" or "foo".

The error case is actually easier to understand, so try to access
  http://localhost/qdemo/foo/
You should get an error page that complains about an "invalid literal
for int()".

Now let's build a real IntegerUI object and see the results.  Follow the
third link in the "Quixote Demo" page, or just go to
  http://localhost/qdemo/12/
You should see a web page titled "The Number 12".

This web page is generated by the '_q_index()' method of IntegerUI:
after all, you've selected a namespace (the IntegerUI object
corresponding to the number 12) with no explicit callable, so Quixote
falls back on the '_q_index()' attribute of that namespace.
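
Putting the pieces together, the walk from URL to callable can be
imitated with a toy traversal function.  This is a simplified sketch in
current Python 3, not the code in demo/integer_ui.py or in Quixote: the
request argument is left out for brevity, Root and traverse() are
invented names, and TraversalError here is a local stand-in.

```python
class TraversalError(Exception):
    """Stand-in for a publishing error; a real publisher would answer 404."""


class IntegerUI:
    _q_exports = ["factorial"]

    def __init__(self, component):
        try:
            self.n = int(component)
        except ValueError as exc:
            raise TraversalError(str(exc))

    def _q_index(self):
        return "The Number %d" % self.n

    def factorial(self):
        result = 1
        for i in range(2, self.n + 1):
            result *= i
        return "%d! = %d" % (self.n, result)


class Root:
    """Toy root namespace: no exports, only a fallback for unknown names."""
    _q_exports = []

    def _q_getname(self, component):
        return IntegerUI(component)


def traverse(namespace, path):
    """Walk a path such as '/12/factorial' and call whatever it leads to."""
    obj = namespace
    for component in path.lstrip("/").split("/"):
        if component == "" and hasattr(obj, "_q_index"):
            obj = obj._q_index
        elif component in getattr(obj, "_q_exports", []):
            obj = getattr(obj, component)
        elif hasattr(obj, "_q_getname"):
            obj = obj._q_getname(component)
        else:
            raise TraversalError(component)
    if not callable(obj):
        obj = obj._q_index  # this toy just falls back on the index
    return obj()
```

Here traverse(Root(), "/12/") ends at IntegerUI._q_index(), and
traverse(Root(), "/foo/") raises TraversalError because int("foo")
fails in the constructor.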

IntegerUI only exports one interesting method, 'factorial()'.  You can
call this method by following the "factorial" link, or just by accessing
  http://localhost/qdemo/12/factorial

Remember how I said the URL is part of the user interface?  Here's a
great example: edit the current URL to point to a different integer.  A
fun one to try is 2147483646.  If you follow the "next" link, you'll get
an OverflowError traceback (unless you're using a 64-bit Python!),
because the web page for 2147483647 attempts to generate its own "next"
link to the web page for 2147483648 -- but that fails because current
versions of Python on 32-bit platforms can't handle regular integers
larger than 2147483647.

Now go back to the page for 2147483646 and hit the "factorial" link.
Run "top" on the web server.  Get yourself a coffee.  Await the heat
death of the universe.  (Actually, your browser will probably time out
first.)  This doesn't overflow, because the factorial() function uses
Python long integers, which can handle any integer -- they just take a
while to get there.  However, it illustrates another interesting
vulnerability: an attacker could use this to launch a denial-of-service
attack on the server running the Quixote demo.  (Hey, it's just a demo!)
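
To get a feeling for how hopeless that request is, you can estimate the
size of the answer without computing it.  The sketch below (current
Python 3; digits_in_factorial() is a name invented here) uses the
standard library's math.lgamma(), which returns the natural logarithm of
the gamma function, so lgamma(n + 1) is ln(n!).  Being floating-point,
the result is an estimate for very large n.

```python
import math


def digits_in_factorial(n):
    """Estimate the number of decimal digits in n! without computing it."""
    # lgamma(n + 1) is ln(n!), so dividing by ln(10) gives log10(n!).
    return int(math.lgamma(n + 1) / math.log(10)) + 1


if __name__ == "__main__":
    for n in (12, 10000, 2147483646):
        print("%d! has roughly %d digits" % (n, digits_in_factorial(n)))
```

For 2147483646 the estimate runs to more than ten billion digits, so
even printing the result, never mind multiplying it out, is out of the
question for a CGI script.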

Rather than fix the DoS vulnerability, I decided to use it to illustrate
another Quixote feature: if you write to stderr, the message winds up in
the Quixote error log for this application (/tmp/quixote-demo-error.log
by default).  The IntegerUI.factorial() method uses this to log a
warning of an apparent denial-of-service attack:

  def factorial (self, request):
      if self.n > 10000:
          sys.stderr.write("warning: possible denial-of-service attack "
                           "(request for factorial(%d))\n" % self.n)
      return "%d! = %d" % (self.n, fact(self.n))

Since the Quixote error log is where application tracebacks are
recorded, you should be watching this log file regularly, so you would
presumably notice these messages.

In real life, you'd probably just deny such a ludicrous request.  You
could do this by raising a Quixote publishing error.  For example:

  def factorial (self, request):
      from quixote.errors import AccessError
      if self.n > 10000:
          raise AccessError("ridiculous request denied")
      return "%d! = %d" % (self.n, fact(self.n))
