$Cambridge: hermes/src/prayer/docs/ROADMAP,v 1.2 2008/09/16 09:59:56 dpc22 Exp $

A brief overview of Prayer
==========================

A function by function breakdown should eventually appear in TechDoc. This
is just a quick guide to the files that you will see in the distribution.

Finding your way around the distribution
========================================

README  <--> docs/README

INSTALL <--> docs/INSTALL

docs/
  README:      Introduction to Prayer.

  INSTALL:     Installation guide.

  FEATURES:    A brief list of features for people wondering just what is
               different about this particular Webmail package.

  LICENSE:     Copy of the GNU GENERAL PUBLIC LICENSE

  NOTICE:      Describes distribution terms and incorporated code.
 
  ROADMAP:     An overview of the software distribution. This file!

  TODO:        The current wishlist

  DONE:        Items moved from the TODO list. Should probably use a ChangeLog.

  ICONLIST:    A list of sources for various icons used by Prayer.

  SECURITY:    Some thoughts about HTML security. Need to update XXX

  DESIGN:      Some notes and discussion about the Prayer design

  URL_OPTIONS: Modifiers for login URLs

  CMD_LINE:    Command line options for prayer and other binaries: modify
               behaviour, add debugging, etc.

  LOGS:        Describes format of the various log files.

Makefile:
  Makefile which will install software from the subdirectories

Config:
  Configuration file used by all subsidiary Makefiles.

Config-RPM:
  Version of Configuration file tailored for RedHat RPM build process
  (triggered if RPM_BUILD=true passed as argument to make)

defaults:
  Default versions of Config and Config-RPM (restored by "make distclean"),
  plus an RPM spec file for Prayer.

accountd/
  Source code for the accountd support daemon.

prayer/
  Source code for the main prayer daemons.

Code layout for prayer/ directory
=================================

There are lots of files in this directory: I like to split tasks up into
lots of small pieces. However, I've tried quite hard to ensure that
external (i.e. non-static) functions have sensible names so that they are
easy to find.  Typically you should be able to rely on the fact that a
given function e.g: pool_printf() will be defined in pool.c, with a
prototype in the header file pool.h. The only real exceptions to this rule
are c-client calls (which typically start mail_XXX) and a bunch of macros
that I have defined for seven very common functions.

   Macro          Corresponds to function
   =====          =======================

   bputc          buffer_putchar
   bputs          buffer_puts
   bprintf        buffer_printf

   ioputc         iostream_putchar
   ioputs         iostream_puts
   ioprintf       iostream_printf
   iogetc         iostream_getchar

Common support modules
======================

pool: Memory pool.
  Defines a group of arbitrary size memory assignments. A pool is created
  using pool_create(). Allocations are made using a number of different
  functions including pool_alloc(), pool_strdup(), pool_strcat() and
  pool_printf(). Large blocks are allocated separately, small allocations
  come from aggregate memory blocks. pool_free() frees all of the memory
  assigned in the pool. The NIL pool is special: it allocates against
  conventional memory.
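
The pool idea can be sketched in a few lines. The function names below are
the ones mentioned above, but their signatures are guesses, and this
version is deliberately simplified: every allocation is a separate malloc()
chained onto the pool, with none of the aggregate block handling that the
real module does for small allocations.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Header placed in front of every allocation so the pool can find it again.
 * The max_align_t member keeps the memory after the header suitably aligned. */
union pool_block {
    union pool_block *next;
    max_align_t       align;
};

struct pool {
    union pool_block *blocks;   /* every allocation made from this pool */
};

static struct pool *pool_create(void)
{
    struct pool *p = malloc(sizeof *p);

    if (p == NULL)
        abort();
    p->blocks = NULL;
    return p;
}

/* Allocate size bytes from pool p. A NULL (NIL) pool falls back to plain
 * malloc(), so the caller must free() that result individually. */
static void *pool_alloc(struct pool *p, size_t size)
{
    union pool_block *b;

    if (p == NULL) {
        void *mem = malloc(size);

        if (mem == NULL)
            abort();
        return mem;
    }
    b = malloc(sizeof *b + size);
    if (b == NULL)
        abort();
    b->next   = p->blocks;
    p->blocks = b;
    return b + 1;               /* usable memory starts after the header */
}

/* Copy a string, including its terminating NUL, into the pool. */
static char *pool_strdup(struct pool *p, const char *s)
{
    size_t len = strlen(s) + 1;

    return memcpy(pool_alloc(p, len), s, len);
}

/* Release every allocation made from the pool, then the pool itself. */
static void pool_free(struct pool *p)
{
    assert(p != NULL);
    while (p->blocks != NULL) {
        union pool_block *next = p->blocks->next;

        free(p->blocks);
        p->blocks = next;
    }
    free(p);
}
```

The attraction is that a request handler can allocate freely and release
everything with one pool_free() call at the end.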

list: Singly linked list
  Defines generic list structure including add, remove and lookup (indexed
  by element name and by list offset). Lots of data structures "inherit"
  from list using "struct list" as a binary header. Expect to see a lot of
  for loops using generic list items that immediately cast the list_item
  pointer to a more specific list type. Lists are built on pools.
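
The "inheritance" trick looks roughly like this. The field names, struct
folder and the helper functions are all made up for illustration, and the
elements here live in caller owned storage rather than in a pool. The cast
is valid C because the generic header is the first member of the more
specific struct.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Generic header: every list element starts with these fields. */
struct list_item {
    struct list_item *next;
    const char       *name;
};

struct list {
    struct list_item *head;
    struct list_item *tail;
};

/* A more specific element type (hypothetical). The header must come first. */
struct folder {
    struct list_item hdr;
    unsigned long    unread;
};

/* Append an item to the end of the list. */
static void list_push(struct list *l, struct list_item *item)
{
    item->next = NULL;
    if (l->tail != NULL)
        l->tail->next = item;
    else
        l->head = item;
    l->tail = item;
}

/* Lookup indexed by element name. Returns NULL if there is no match. */
static struct list_item *list_lookup_byname(struct list *l, const char *name)
{
    struct list_item *li;

    for (li = l->head; li != NULL; li = li->next)
        if (strcmp(li->name, name) == 0)
            return li;
    return NULL;
}

/* Typical loop: walk generic items, cast each one to the specific type. */
static unsigned long total_unread(struct list *l)
{
    struct list_item *li;
    unsigned long     total = 0;

    for (li = l->head; li != NULL; li = li->next) {
        struct folder *f = (struct folder *) li;

        total += f->unread;
    }
    return total;
}
```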

assoc: Associative array (aka hash)
  Perl style associative array built on pools, with add, remove and lookup
  methods. Was originally called hash, but the c-client IMAP 2001 toolkit
  defines its own "hash" type.
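
For readers who have not met the idea, an associative array with add,
remove and lookup can be sketched as a small chained hash table. Everything
here is an assumption for illustration (function names, fixed bucket count,
malloc instead of pools, keys and values borrowed rather than copied); it
is not the assoc module's actual interface.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define ASSOC_BUCKETS 64

struct assoc_entry {
    struct assoc_entry *next;
    const char         *key;
    void               *value;
};

struct assoc {
    struct assoc_entry *bucket[ASSOC_BUCKETS];
};

/* Simple multiplicative string hash reduced to a bucket index. */
static unsigned assoc_hash(const char *key)
{
    unsigned h = 5381;

    while (*key)
        h = h * 33 + (unsigned char) *key++;
    return h % ASSOC_BUCKETS;
}

/* Add a new key, or replace the value stored under an existing key. */
static void assoc_update(struct assoc *a, const char *key, void *value)
{
    unsigned            h = assoc_hash(key);
    struct assoc_entry *e;

    for (e = a->bucket[h]; e != NULL; e = e->next) {
        if (strcmp(e->key, key) == 0) {
            e->value = value;
            return;
        }
    }
    e = malloc(sizeof *e);
    if (e == NULL)
        abort();
    e->key       = key;
    e->value     = value;
    e->next      = a->bucket[h];
    a->bucket[h] = e;
}

/* Return the value stored under key, or NULL if the key is absent. */
static void *assoc_lookup(struct assoc *a, const char *key)
{
    struct assoc_entry *e;

    for (e = a->bucket[assoc_hash(key)]; e != NULL; e = e->next)
        if (strcmp(e->key, key) == 0)
            return e->value;
    return NULL;
}

/* Remove key. Returns 1 if an entry was removed, 0 if it was not present. */
static int assoc_delete(struct assoc *a, const char *key)
{
    struct assoc_entry **pp = &a->bucket[assoc_hash(key)];

    for (; *pp != NULL; pp = &(*pp)->next) {
        if (strcmp((*pp)->key, key) == 0) {
            struct assoc_entry *dead = *pp;

            *pp = dead->next;
            free(dead);
            return 1;
        }
    }
    return 0;
}
```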

buffer: Arbitrary length (typically large) strings with linear access methods
  Buffers are arbitrary length strings built on pools which can extend
  indefinitely (typically using the bputc, bputs and bprintf macros).
  There are also functions to cast buffers or ranges of buffers into C
  "char *" strings in the same or a different pool. Used all over the
  place, mostly a comment on C string handling. Possible that "buffer"
  should be called something else to reflect widespread use within Prayer.

memblock: Arbitrary length (typically small) strings.
  A memblock is a contiguous block of memory that can be resized. Typically
  used for status messages and other feedback from Prayer to the users:
  strings that are typically quite short, but which may have arbitrary size.
  Memblocks are useful for entities which persist across HTTP requests.
  Buffers are typically more useful for short-lived entities which may grow.

user_agent: User agent properties and features
  A collection of features, typically boolean flags, that define
  capabilities of a particular browser. Is also used to restrict or force
  use of various optimisations and debugging techniques.

iostream: Input/Output abstraction
  Simple replacement for stdio which provides: bidirectional I/O on sockets
  (stdio didn't seem to work for me), transparent SSL support, consistent
  error handling (I hope) and timeouts on both input and output requests.
  Actually included by two simple wrapper classes iostream_prayer and
  iostream_session so that we can include SSL support for the two halves
  independently at compile time.

request: HTTP request parsing
  Routines to parse HTTP requests: method, headers and body. Also includes
  some auxiliary routines for chopping up HTTP POST forms and file uploads.
  The request struct includes response information (see next section).

response: HTTP response generation.
  Assorted routines used to generate HTTP response headers for data that is
  queued in request->write_buffer e.g: response_html(), response_raw().
  Also includes routines for actually sending the response header and body
  over a nominated iostream plus session telemetry. This second part is
  largely historical: possible that it should be split off now.

html_common: HTML markup routines used by frontend/error responses
  Just factoring out some common code which might happen to change...

setproctitle: Argv mangling
  A bunch of routines stolen from Sendmail (eek!) for setting process
  header information to indicate what is going on. Will only work on
  certain platforms, but the more feedback that you can get the better...

ipaddr: IP address manipulation
  A small class for manipulating IP addresses (convert to canonical form,
  convert to and from string, compare). Idea is to hide IP version
  specific information, however we still have to implement IPv6 addresses!
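
One way to hide the IP version behind a single type is sketched below,
using the standard POSIX inet_pton() and inet_ntop() calls. The struct
layout and the ipaddr_parse, ipaddr_text and ipaddr_equal names are
invented for this example and are not taken from the ipaddr module, which
(as noted above) does not handle IPv6 yet.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <string.h>
#include <sys/socket.h>

struct ipaddr {
    int           family;     /* AF_INET or AF_INET6 */
    unsigned char bytes[16];  /* network byte order; IPv4 uses the first 4 */
};

/* Parse text into *ip. Returns 1 on success, 0 if text is not an address. */
static int ipaddr_parse(struct ipaddr *ip, const char *text)
{
    memset(ip, 0, sizeof *ip);
    if (inet_pton(AF_INET, text, ip->bytes) == 1) {
        ip->family = AF_INET;
        return 1;
    }
    if (inet_pton(AF_INET6, text, ip->bytes) == 1) {
        ip->family = AF_INET6;
        return 1;
    }
    return 0;
}

/* Write the canonical text form into out. Returns out, or NULL on error. */
static const char *ipaddr_text(const struct ipaddr *ip, char *out, size_t len)
{
    return inet_ntop(ip->family, ip->bytes, out, (socklen_t) len);
}

/* Two addresses are equal if the family and the significant bytes match. */
static int ipaddr_equal(const struct ipaddr *a, const struct ipaddr *b)
{
    size_t n = (a->family == AF_INET) ? 4 : 16;

    return a->family == b->family && memcmp(a->bytes, b->bytes, n) == 0;
}
```

Callers only ever see struct ipaddr, so adding another address family
should not need changes outside these three functions.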

log: Logging functions
  Functions for logging information at different priorities. There are also
  a bunch of obsolete logging functions specific to the prayer and session
  systems which are now built on top of log_XXX functions. We should
  probably try to phase these out now.

os: OS specific support functions.
  Actually #includes an OS specific file to avoid #ifdef soup within a
  single file. A wide range of functions to set up Internet and Unix
  domain sockets, provide locking etc. Will only grow as time goes on.
  The big idea is to try and avoid OS specific behaviour anywhere else
  within Prayer.

config: Prayer configuration file
  struct config plus a series of routines for parsing and then testing
  the prayer configuration file with any overrides.

Prayer Frontend Specific modules
================================

prayer_login:
  Routines responsible for generating initial login screen and processing
  subsequent HTTP POST request for login.

prayer_server: Main Frontend server module
  Simple and Prefork models which accept incoming HTTP requests and service
  them. Three classes of HTTP request: icons, generate/process login screen
  and requests which should be proxied through to the session process

prayer_main: Parses command line options.
  Reads in configuration file from one of three places. Overrides
  configuration options from command line.  Checks configuration. Binds to
  specified HTTP ports. If running as root: lose root privileges and
  re-exec self with "--ports" command line option to prevent core dump
  paranoia (goto line 1!). Starts prayer_server()

Prayer Backend (Session) support modules
========================================

session: Main session state
  Lots and lots of state specific to this login session. Basically global
  state: lots of other modules dig in around struct session without a well
  defined interface. However this approach makes it rather easier to move
  to a single, massively multithreaded, process if the urge ever strikes.
  It also means that all of the "global" state is in a well defined place.

addr: Address parsing routines.
  Routines for splitting up RFC822 addresses. Relies heavily on c-client
  to do the hard work at the moment.

ml: c-client interface 
  Abstraction to the c-client mail_XXX with some automatic handling for
  some of the more obvious error conditions e.g: TRYCREATE.  Callers also
  provide a struct session which provides a number of hooks for logging
  and user info callbacks. ml and mm are the only modules in Prayer which
  have their own global state: this is an unfortunate consequence of the
  c-client design.

mm: C-client callbacks
  Callback routines used by c-client. Linked strongly to the ml module.

stream: MAILSTREAM support
  A small number of helper functions for c-client MAILSTREAM objects.
  Should be merged into some generic c-client wrapper?

dirlist: Directory Listing
  Cache for specific directory. Part of the dircache module (see below).

dircache: Directory Cache
  A generalised tree structure directory cache. Used to cache directory
  structure for this user to optimise out round trips to the IMAP server.

html: HTML Markup support
  Assorted routines for HTML markup to a target buffer modified by session
  configuration. Goal should be to move all non-trivial HTML markup here!

html_secure: HTML sanity check
  Routines that translate HTML in source buffer into equivalent HTML
  in target buffer removing all dodgy constructs and tags. Experimental.

string: String (i.e: char *) manipulation routines
  A range of routines for splitting up strings into their component
  sections and putting them back together again afterwards. Poor man's
  regexps, but probably rather faster.
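
A typical routine of this kind chops the next token off the front of a
string in place. The sketch below is an assumption about the general shape
only: next_token() is a made-up name, not a function taken from string.c.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return the next space or tab delimited token from *sp, or NULL when none
 * remain. The string is modified in place (a NUL overwrites the delimiter)
 * and *sp is advanced past the token. */
static char *next_token(char **sp)
{
    char *s = *sp;
    char *tok;

    s += strspn(s, " \t");          /* skip leading whitespace */
    if (*s == '\0') {
        *sp = s;
        return NULL;
    }
    tok = s;
    s += strcspn(s, " \t");         /* find the end of the token */
    if (*s != '\0')
        *s++ = '\0';
    *sp = s;
    return tok;
}
```

Calling it repeatedly on a writable copy of "GET /session/dpc22//compose
HTTP/1.0" yields the method, the URL and the protocol version in turn.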

draft: Message draft
  A draft message in its own pool, plus any attachments as separate entities.

speller: Spelling check engine
  Low level interface to spell check engine. Hasn't changed in a long time,
  should probably be reviewed.

msgmap: Message listing with sort and zoommap filters applied.
  Cache for current sorted+zoomed message view. Has special handling for
  default case (sort on arrival, no zoom applied) so that client routines
  can use zoommap as a generic access to the folder listing without having
  to worry about details of any underlying sort or zoom. Important that
  clients call zoommap_invalidate if they change the underlying folder
  listing and zoommap_associate when the current folder changes.
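
The special handling for the default case might look something like the
following. This is only a guess at the idea: the struct fields and the
msgmap_value() name are hypothetical, and the real module also has to
build and invalidate the sorted view.

```c
#include <assert.h>
#include <stddef.h>

struct msgmap {
    unsigned long        nmsgs;   /* messages in the current view */
    const unsigned long *sorted;  /* view order, or NULL for arrival order */
};

/* Translate a 1-based position in the view into a message number. */
static unsigned long msgmap_value(const struct msgmap *m, unsigned long offset)
{
    assert(offset >= 1 && offset <= m->nmsgs);
    if (m->sorted == NULL)
        return offset;              /* default case: identity mapping */
    return m->sorted[offset - 1];
}
```

Client code calls msgmap_value() either way, and never needs to know
whether a sort or zoom is currently in force.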

cdb: Constant database
  Interface to Qmail style cdb files. Fast lookups for static data.

options: User defined data
  Container class for user defined state:
    prefs, abook, dictionary, roles, favourites

prefs: Preferences library
  Prayer user preferences data structure and a few support routines.

abook: Addressbook Abstraction
  Routines and data structures for manipulating personal addressbook.

dictionary: Dictionary Abstraction
  Routines and data structures for manipulating personal dictionary.

role: Role Abstraction
  Routines and data structures for manipulating roles.

favourite: Favourite mail folders
  Support class for favourite folders list.

postponed: Postponed messages list
  Routines for manipulating the postponed-msgs list.

rfc1522: RFC822 header decoding
  Routines for decoding QP encoding in message header, adapted/stolen from
  Pine code. Should really be called rfc2XXX today...

banner: Icon banner
  Support Class for manipulating list of icons that appear as a group.

account: Account management
  Support Class for Account management via accountd server

filter: Mail filtering
  Support Class for mail filtering (including mail redirection and vacation).

wrap: Linewrap algorithm
  Adaptive and hopefully quite intelligent routine for line wrapping large
  blocks of text as a series of small identifiable chunks. We still have to
  see how well things work in practice of course!

portlist: List of HTTP ports
  Class used by session_inet() to manipulate long lists of active and
  idle HTTP ports. Experimental at the time of writing.

session_config: Session server specific configuration
  Sets up parts of config structure which are specific to login sessions.

session_exchange:
  Responsible for processing a single HTTP request that will update the
  state of this login session. Used by both session_unix() and
  session_inet().

session_unix:
  Runs login session proxying all incoming requests via a Unix domain socket

session_inet:
  Runs login session with HTTP requests coming direct to Internet domain
  socket.

session_idle:
  Support routines for session_inet. Deal with HTTP requests to an Internet
  domain socket whose session has disconnected or timed out.

session_server:
  Master server routine for prayer_session. Accepts login requests from
  frontend server and forks off a separate process to validate the login
  request and run session_unix() or session_inet() as appropriate.


session_main: main() function for session server
  Reads in configuration file from one of three places. Overrides
  configuration options from command line.  Checks configuration. Binds to
  specified HTTP ports. If running as root: lose root privileges and
  re-exec self with "--ports" command line option to prevent core dump
  paranoia (goto line 1!). Starts frontend_server()

Command modules
===============

cmd.c dispatches command request sent to a logged in prayer session. Each
cmd_XXX module corresponds to a single part of the interface. Examples:

 /session/dpc22//compose     -->  cmd_compose()
 /session/dpc22//abook_list  -->  cmd_abook_list()
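
The mapping can be pictured as a simple table lookup. This is a sketch
rather than the code in cmd.c: the struct session fields, the handler
bodies and the cmd_dispatch() name are invented, and it assumes that the
command name is the last path component, as in the two example URLs.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, cut down session state. */
struct session {
    const char *username;
    const char *last_cmd;   /* records which handler ran, for illustration */
};

typedef void (*cmd_fn)(struct session *);

static void cmd_compose(struct session *s)    { s->last_cmd = "compose"; }
static void cmd_abook_list(struct session *s) { s->last_cmd = "abook_list"; }

/* One entry per cmd_XXX module. */
static const struct cmd {
    const char *name;
    cmd_fn      fn;
} cmds[] = {
    { "abook_list", cmd_abook_list },
    { "compose",    cmd_compose    },
};

/* Run the handler named by the last component of url.
 * Returns 1 if a handler was found, 0 otherwise. */
static int cmd_dispatch(struct session *s, const char *url)
{
    const char *name = strrchr(url, '/');
    size_t      i;

    if (name == NULL)
        return 0;
    name++;                         /* step over the '/' */
    for (i = 0; i < sizeof cmds / sizeof cmds[0]; i++) {
        if (strcmp(cmds[i].name, name) == 0) {
            cmds[i].fn(s);
            return 1;
        }
    }
    return 0;
}
```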

Hopefully it should be fairly easy to find the relevant piece of code
from the session URLs. The one caveat is that Prayer can be configured to
use transparent page substitution to optimise out HTTP level redirects
which would otherwise be sent to the user agent. For example there is a
"POST" form on the initial welcome screen. When you press the submit button
the user interface will redirect you to another screen, typically
cmd_user_level() or cmd_list(). However the transparent substitution means
that the URL that appears at the top of the user's screen will still read
"welcome". The solution is to disable page substitution, either in the
global configuration file, in a specific user's preferences or by using the
debug screen to override the default setting.
