**********************************************************************
WISDOM about computing attributes: what, when and why
**********************************************************************

Date: Wed, 14 Jul 2004 12:32:00 -0500
From: Derek Wright <wright@chopin.cs.wisc.edu>

On Wed, 14 Jul 2004 03:48:15 -0500 (CDT)  Jeffrey Freschl wrote:

> When a Startd receives a direct query (e.g., condor-status -direct
> balthazar) it first generates a new ClassAd that will contain all of
> its attributes.  I'm confused as to whether it re-calculates values for
> all of its attributes, or if it only re-calculates stale attributes?

it recomputes everything that changes dynamically.  e.g. it finds out
the current load average and idle time, but it doesn't recompute how
much physical RAM is in the machine.  "stale" is sort of hard to
define in this case, but effectively, yes, it recomputes everything
that's "stale".

> This line of code is what is confusing me (located in ResMgr.C
> ResMgr::makeAdList):
> 
>                 // Make sure everything is current
>         compute( A_TIMEOUT | A_UPDATE );

if you don't care about the internals of the startd, stop reading
now... ;)

granted, this stuff is confusing.  basically, most objects within the
startd support a compute() method to compute values for whatever stuff
that object is keeping track of, and a publish() method that publishes
values into a given classad.  both take this bitmask (amask_t, short
for "attribute mask type") to specify exactly what kind of work they
should do (what attributes we care about).

the problem is that the kind of info the startd keeps track of
(especially in an SMP) is very complicated.  we don't want to
recompute everything all the time, or it'd get too expensive.  so,
part of the bitmask specifies what "frequency" we're computing for,
and part of it can specify special kinds of attributes we have to
handle differently.

in terms of the frequency, some things can change all the time, and
are often used in policy expressions.  those things need to be
recomputed at every *timeout*.  this includes keyboard activity, load
average, etc.  some things don't change very often, and aren't usually
used in policy expressions, so we only recompute them when we're going
to publish a new *update* to the outside world (e.g. free swap and
disk space on the machine).  some things are *static*, and never
change, so we only have to compute them once (though we recompute on
reconfig, in case the configuration changes in a way that effects
them), for example, RAM, ARCH, OPSYS, etc.  terms i enclosed in '*'
are the names used for the bits (check out ResAttributes.h).

in terms of special types of attributes, some things are *shared*
attributes which are system-wide values that are shared among all the
slots (RAM is a good example... first you have to figure out the total
RAM on the machine, then you divide it up for each slot according to
the configuration).  some attributes are *summed* attributes, which
are computed individually for each slot, and then the total is
computed by adding up all the values (e.g. CondorLoadAverage).  some
attributes are *evaluated*, which means they're just classad
expressions that are evaluated for each slot.  it's actually a
completely insane process to get all of this right for an SMP.  the
interested reader can check out:

ResMgr::compute()  [ResMgr.C, line 1207]

it should be slightly better commented than it is, particularly the
first real code block involving lots of bitmask manipulations. ;)
but, hopefully it's somewhat clear.  enjoy.

-d


