
The RAD package is a simple package for Reverse Automatic Differentiation.
RAD is specialized for function and gradient computations: computing a
scalar function and its gradient vector (of first partial derivatives).

This directory contains source for two forms of the RAD package: a
basic, non-templated form and a templated variant.  For the basic
form, one must include rad.h and link with auxiliary routines whose
source is in file radops.c.  For the templated form, one includes
trad.h rather than rad.h.  Type ADvar in the basic form corresponds to
ADvar<double> in the templated form.  (See "TEMPLATED RAD PACKAGE"
below for more discussion of the templated form.)

In short, with the basic package, one declares variables of type ADvar,
initializes independent ADvar variables with arbitrary scalar (e.g.,
double) values, and computes other ADvar values from them; the last
one computed is the objective f, partials of which with respect to
the other ADvar variables are computed when one invokes

	ADcontext::Gradcomp();

If v is of type ADvar, the partial of f with respect to v is then
available as v.adj().  The current value of v in general is available
as v.val().  Both the .val() and .adj() values remain available until
the next time a value is assigned to an ADvar variable, at which time
the memory used for such values is recycled.  With the templated form,
the above call would become

	ADcontext<double>::Gradcomp();

An overview of and motivation for the RAD package appear in
"Semiautomatic Differentiation for Efficient Gradient Computations"
by David M. Gay; see http://endo.sandia.gov/~dmgay for a pointer to
this paper.  (The paper deals only with the basic form; the templated
form may take longer to compile, but should give the same run times.
Changes to the basic package since the above paper was written may
result in somewhat better run times for some computations.)

This directory also contains several source files that illustrate
use of both forms of the RAD package.  File demo.c provides a simple
example of using the basic package; comments in demo.c explain
what it does.  Files demo1.c and demo2.c are variants of demo.c
that illustrate two optional ways to keep track of independent ADvar
values, which has been useful in at least one context.  For more
discussion, see "INDEPENDENT VARIABLES" below.

Sometimes it appears useful to have "constant" ADvar values.  (In the
motivating application, all variables are given initial values, and
some are later determined to be "constant".)  If one knows in advance
that some values will be constant, yet wants their sensitivities
(i.e., .adj() values), the variables in question can be given type
ConstADvar (or ConstADvar<type> in the templated form).
Alternatively, an ADvar variable can be passed to AD_Const() to have
it treated as a constant.  Another alternative is to invoke

	ADvar::set_fpval_implies_const(true);

and then assign numeric (i.e., non-ADvar) values to ADvar variables
that are to be treated as constants, and to invoke

	ADvar::set_fpval_implies_const(false);

when subsequent assignments (e.g., to independent ADvar variables)
of numeric values are not to be regarded as constant assignments.
In any event, "constant" variables retain their old values once
computations begin anew after a call on ADcontext::Gradcomp().  If the
.adj() values of "constant" ADvar variables are of interest, one must
invoke

	ConstADvari::aval_reset();

before starting the new computations begin.  File cdemo.c
illustrates the basic form of these facilities, and file
tcdemo.c is a variant of cdemo.c that uses the templated form.

Yet another way to deal with "constant" ADvar values is to compile
with -DRAD_AUTO_AD_Const, i.e., with RAD_AUTO_AD_Const #defined.
At the cost of extra time and space, this causes all ADvar and
IndepADvar variables to retain their last values when computations
begin anew after ADcontext::Gradcomp().  Memory for intermediate
expressions is still recycled when new computations begin.

Functions that return ADvar values simply work.  Sometimes it
is convenient to use functions provided by third parties, i.e.,
functions whose source cannot be adjusted to return ADvar.
In this case, one can invoke ADf1, ADf2, or ADfn to obtain
an ADvar value that will participate correctly in gradient
computations.  ADf1 is for a function of a single variable,
ADf2 for a function of two variables, and ADfn for a function
of arbitrarily many variables.  One must provide the necessary
partials to the ADf* functions.  File udfdemo.c illustrates use
of ADf1, ADf2, and ADfn, and file tudfdemo.c is a corresponding
templated illustration.

By default, if v and w are of type ADvar, the assignment "v = w;"
aliases v and w in that if no further assignments to v or w occur,
then after "ADcontext::Gradcomp();", v.adj() and w.adj() will be
the same.  To make v.adj() and w.adj() reflect references just to
v and w, respectively, change the assignment to "v = copy(w);".
Alternatively, to make all simple assignments "v = w;" behave as
though written "v = copy(w);", compile with -DRAD_NO_EQ_ALIAS (i.e.,
with RAD_NO_EQ_ALIAS #defined).  One can select the aliasing behavior
file by file, by compiling with or without -DRAD_NO_EQ_ALIAS.

Compiling with -DRAD_EQ_ALIAS can considerably reduce the storage
and time taken at run time, at the "cost" of aliasing variables
that have the same values due to assignment statements.  For example,
with -DRAD_EQ_ALIAS,

	x = // something complicated
	y = x;
	f = g(x) + h(y);

would be treated the same as

	x = // something complicated
	f = g(x) + h(x)

as far as partials w.r.t. x are concerned, and both x and y would
be reported to have the same adjoints.  (In the default case
where -DRAD_EQ_ALIAS is not specified, x and y would in general
wind up with different adjoint values.)  Compiling with
-DRAD_AUTO_AD_Const causes -DRAD_EQ_ALIAS to be ignored.

The basic RAD package provides a way, on systems that use IEEE
arithmetic, to find ADvar values that are used without being
reinitialized after a call on ADcontext::Gradcomp():  simply link with
a suitably compiled version of radops.c and with uninit.o from libf2c.
(An IA32 Linux version of uninit.o appears in this directory.)
Source file radopsdebug.c has a #define that causes about 8 megabytes
of scratch space to be used sequentially for computations with ADvar
values, with the old space set to signaling NaN values when new
computations begin.  References to unreinitialized ADvar variables
then cause a fault.  File debugdemo.c, compiled with "make debugdemo",
illustrates this facility.  For more discussion, see "DIAGNOSING
UNINITIALIZED ADVAR VARIABLES" below.


DIAGNOSING UNINITIALIZED ADVAR VARIABLES

For warnings (at the cost of extra memory) of references to
ADVar values referenced but not initialized after
	ADcontext::Gradcomp();
link with radopsdebug.o and uninit.o rather than with radops.o.
This will cause the old temporary memory to be set to signaling NaN
values at the first assignment of a new ADvar value and should adjust
the floating-point state so that you will get a fault if you reference
an uninitialized ADvar value.  If you compile with -g and run under a
debugger, it should be to find where the incorrect reference happens.

Try
	make debugdemo
	./debugdemo

and, e.g.,

	gdb debugdemo
	run


The setting of signaling NaNs is by the "_uninit_f2c" routine
in uninit.o from libf2c.  (A Linux uninit.o is included herewith.)


TEMPLATED RAD PACKAGE

File Sacado_trad.h combines Sacado_rad.h and Sacado_radops.c int a
templated version of the RAD package.  It permits use of such types as

	ADvar<double>
	ADvar<float>
	ADvar< complex<double> >
	ADvar< complex<float> >

The latter two require a previous "include <complex>".  File tdemo.c is a
variant of demo.c that uses trad.h.  The templated version also declares

	IndepADvar<T>
	ConstADvar<T>

The former are recorded as "independent" ADvar<T> variables (i.e., ADvar
variables based on type T), as above.

The templated package does not presently permit mixing types,
such as ADvar<double> and ADvar<complex<double> > or ADvar<float>.

To diagnose ADvar variables that are referenced but not reinitialized
after a call on ADcontext<type>::Gradcomp(), link with uninit.o and
compile with, say,

	-DRAD_DEBUG_BLOCKKEEP=2000

The number assigned to RAD_DEBUG_BLOCKKEEP (2000 in the above example)
is the number of memory blocks (about 8 kilobytes each) that will be
filled before any memory blocks are recycled.  At the first new
allocation after a Gradcomp() invocation, the old memory blocks are
filled with signaling NaN values.

Compiling with -DRAD_DEBUG inserts code that numbers each ADvari
allocation and permits setting a breakpoint at a particular allocation.
It also arranges for file rad_debug.out (or another name give in shell
variable RAD_DEBUG_FILE -- none if RAD_DEBUG_FILE is an empty string)
to be written at the first Gradcomp() invocation.  See trad.h for
more details.

With the templated RAD packages, two alternatives to
-DRAD_AUTO_AD_Const are compilation with -DRAD_REINIT=1 and
compilation with -DRAD_REINIT=2.  Both cause use of alternative
(distinct) schemes for retaining values after storage is reclaimed
when a new computation begins after Gradcomp or Weighted_Gradcomp.
RAD_REINIT=1 takes more memory and in general requires one to call
aval_reset() before starting a new computation.  (Most of the time
this is not necessary, but it is needed when certain expressions
appear at the start of the computation.) RAD_REINIT=2 involves an
extra test at each reference to and ADvar value (and involves less
memory than RAD_REINIT=1).  On a simple bilinear form evaluation
(vector times known matrix times vector), for small computations that
probably stay in cache, the new schemes take perhaps 2 or 2.5 time as
long as the older ones; on larger computations, the relative time
difference is much smaller.  The relative times for RAD_REINIT=1
versus RAD_REINIT=2 depend on the computation.  With recent versions
of g++, RAD_INIT=1 and RAD_INIT=2 elicit warnings about breaking
strict-aliasing rules; simply ignore these warnings, which in this
case are harmless.  The rules of C++ make it hard to banish these
warnings.

Unless RAD_DISALLOW_WANTDERIV is #defined, the RAD_INIT schemes just
described introduce a new type, ADvar_nd, for ADvar values whose
partials are not of interest.  (When RAD_DISALLOW_WANTDERIV is
#defined, ADvar_nd is the same as ADvar.) No derivative propagations
happen for pieces of the computation that only involve ADvar_nd values
and constants.  If x is an ADvar or an ADvar_nd, invoking
x.Wantderiv(1) indicates that partials w.r.t. x are of interest, and
invoking x.Wantderiv(0) indicates that they are not of interest;
x.Wantderiv() returns the current setting (0 or 1).

Another feature of the templated RAD package is a way to find
variables that should be passed to AD_Const or be declared to have
type ConstADvar.  Compiling with -DRAD_Const_WARN implies -DRAD_DEBUG
and -DRAD_AUTO_AD_Const and inserts extra state and tests, so that use
of a variable which should have been passed to AD_Const (or declared
ConstADvar) causes a call on
	RAD_Const_Warn(void *v)
with *(void**)v a pointer to the variable in question.  The idea is to
run under a debugger with a breakpoint set in RAD_Const_Warn.  To
simplify use with a debugger, RAD_Const_Warn is a non-template
function with "C" linkage.  An instance of this function that does
nothing (but can be used for break-point setting) will appear in
libsacado, but can be replaced at will.  By printing *(void**)v at a
breakpoint in RAD_Const_Warn, then looking at the addresses of
variables in the frame in which the operation calling RAD_Const_Warn
appears, one can find the offending variable.


USING TYPEDEFS

After declaring, for example,

	typedef ADvar real;

the the basic package or

	typedef ADvar<Double> real;

with the templated package (for some suitable type Double, and after
#including rad.h or trad.h), one can declare variables of type real
and can invoke

	real::Gradcomp();
or
	real::aval_reset();


COMPUTING JACOBIAN-TRANSPOSE TIMES W FOR SEVERAL W VECTORS

Suppose we have a typedef for "real" in effect as above and have
available

	int n;
	real *v[n];
	Double w[n];

and suppose we have computed something into the real values *v[i] for
0 <= i < n and have assigned suitable weights to the corresponding w[i].
Then invoking

	real::Weighted_Gradcomp(n, v, w);

propagates adjoint values as though we had just said

	ADvar f;

	f = 0;
	for(i = 0; i < n; i++)
		f += w[i] * *v[i];

	real::Gradcomp();

One can make several calls on real::Weighted_Gradcomp() with different
choices of n, v, and w, extracting .adj() values between calls, before
beginning a new computation, at which time memory will be recycled as
usual.  Invoking

	real::Gradcomp();

has the same effect as

	ADvar f, *v1;
	Double w1;
	// ...
	f = ... ;
	v1 = &f;
	w1 = 1.;
	real::Weighted_Gradcomp(1, &v1, &w1);

Weighted_Gradcomp has signature

 void ADcontext::Weighted_Gradcomp(int n, ADvar **v, double *w);

in the basic package and

 void ADcontext<Double>::Weighted_Gradcomp(int n, ADvar<Double> **v, Double *w);

in the templated package; the header files include inline functions to permit
invoking ADvar::Gradcomp(), ADvar::Weighted_Gradcomp(), ADvar::aval_reset()
(and similarly for IndepADvar in place of ADvar).

David M. Gay (dmgay@sandia.gov)
Sandia National Laboratories
Optimization and Uncertainty Estimation Department
P.O. Box 5800,  MS 1318
Albuquerque, NM 87185-1318
U.S.A.
