SATAN data elements

| GSI | Biophysics | SATAN long write-up |

SATAN is designed to handle various types of data:

Raw data (listmode, singles)
analyzers
calibrations
linearizations
global parameters

To begin with the origin of data, let's consider a simple telescope experiment:

A scattered particle generates three signals which are digitized and form a set of correlated parameters (an event). These data can be saved on tape or disk and are fed into an analysis program which calculates derived quantities and accumulates spectra.

And here's the flow of data generated with an experiment like the one above:

Data flow in SATAN. The data may be fed in from an external device (disk, tape, online) and are distributed by the dispatcher to several analysis sections. The output may consist of modified raw data or spectra. All these processes are controlled interactively via commands.

The various data types will be explained in the following.

Raw data (listmode, singles)

SATAN can handle two families of raw experimental data in binary format: EDAS and GOOSY (SBS, MBS) data. Both are read with the INPUT command, unpacked and dispatched during reading, and passed to the analysis routine.
Both EDAS and GOOSY data are identified by a data type and a data subtype.

List mode data consist of sequences of events. An event represents the set of numerical values associated with a physical event; it is a collection of all parameters measured in connection with the physical event.
An event consists of several parameters correlated in time; each parameter is a numerical value resulting from the conversion of the analog signal delivered by a detector. The parameters may represent physical coincidences (e.g. two detectors at different angles looking at coincident particles) or just different attributes of a single physical event (energy loss and residual energy of particles in the telescope experiment for example). To keep track of the correlations between the different parameters for later analysis, they are normally written event-by-event onto a storage device.
Listmode data are always passed to the user analysis.

Singles data are uncorrelated data. They are incremented in the data taking hardware without storing the data flow, thus forgetting the history of the events. When singles data are found in the input stream they are processed automatically by the system (not passed to the user analysis).

EDAS singles data

These are mostly spectra accumulated in the hardware of the experiment computer as DMI (Direct Memory Increment) or CMI (CAMAC Memory Increment) spectra. Spectra dumped in the EDAS subsystem GOLDA with the command ADUMP, either to local magnetic tape or via the fast data link to the former IBM mainframe, are controlled in SATAN by the commands IPDP and IPRESET. By default such spectra are skipped when found in the input data stream. The user may, however, specify that all or a selected subset of them should automatically be converted to analyzers of special event names. Additionally, the SATAN command IPDUMP provides a fast, standard method of transferring GOLDA spectra from (converted) tapes or disk files directly to the VSAM analyzer library or to Nuclear Data format data sets. All or a selected number of files may be processed. The converted spectra are then available to other analysis programs (e.g. written in FORTRAN), either through the VSAM interface procedure $FVAN or by reading the data sets directly.

EDAS listmode data

These are legacy data taken with GSI's former data acquisition system in the 1970s and 1980s. The data buffers have a fixed length of 4 kB. Currently only buffers of type 3 are supported. EDAS binary data are always in big-endian byte order, and character data are encoded in EBCDIC rather than ASCII. The SATAN I/O routines perform all necessary data conversions (EBCDIC to ASCII, and byte swapping on little-endian machines).
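
The two conversions mentioned above can be illustrated outside SATAN. The following Python sketch is not SATAN's I/O code and does not use the real EDAS buffer layout, which is not described here. It assumes a hypothetical record consisting of an 8-character EBCDIC label followed by two 16-bit integers and one 32-bit integer, and it assumes code page 037 as the EBCDIC variant.

```python
import struct

# Hypothetical record layout (NOT the real EDAS buffer format):
# 8 bytes of EBCDIC text, two unsigned 16-bit and one unsigned 32-bit integer.
# The ">" prefix selects big-endian byte order without padding.
RECORD = struct.Struct(">8sHHI")

def decode_record(raw: bytes):
    """Return (label, a, b, c) from one big-endian, EBCDIC-encoded record."""
    text, a, b, c = RECORD.unpack(raw)
    # cp037 is one common EBCDIC code page; the variant used by EDAS is an assumption.
    label = text.decode("cp037").rstrip()
    return label, a, b, c

# Build a sample record the way a big-endian EBCDIC machine would have written it.
sample = "ENERGY".ljust(8).encode("cp037") + struct.pack(">HHI", 3, 4096, 123456)
print(decode_record(sample))
```

Because the byte order is stated explicitly in the format string, the same code gives the same result on little-endian and big-endian hosts, so no separate byte-swap step is needed.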

GOOSY/SBS/MBS listmode data

These are data taken with GSI's newer acquisition systems based on J11, CAMAC or VME frontend processors. The data are organized in (GOOSY) buffers of various types. Currently recognized buffer types are:

2000: file header
4: events of type,subtype 4,1 and 4,2
6: MBD events of type,subtype 6,1
10: VME events of type,subtype 10,1 (10,2 and 10,3 are not yet supported)

In GOOSY, character data are normally encoded in ASCII, whereas binary data can be either little-endian or big-endian, depending on where the data were originally written. An exception is LMDI data: these are GOOSY data which have been converted for use on GSI's former IBM MVS mainframe computer. They come with character data in EBCDIC and binary data in big-endian byte order.
Whatever the data encoding, the SATAN I/O routines handle all event unpacking and byte swapping transparently to the user's analysis program. The user analysis always receives a complete event, ready to access.

Analyzers

An analyzer is the software counterpart of a hardware multichannel analyzer (MCA, used to take singles experiment data). It is a complex data object comprising

a spectrum (histogram) of one to four dimensions, with up to 32k bins for each dimension. The spectrum contents may be two- or four-byte integers, or floating point numbers.
the corresponding error spectra, that is, up to four additional spectra of the same size and dimensionality holding the errors (uncertainties): the upper and lower vertical and the left and right horizontal errors. If only upper vertical (lower vertical) error data are stored, the lower vertical (upper vertical) error data are assumed to be identical. The same holds for left and right horizontal error data, respectively.
conditions with associated flags,
display windows and display points, and
general information and comments.

An analyzer's main purpose is to accumulate listmode event parameters in its associated spectrum.
The conditions define a spectrum segment for usage in analysis programs and with SATAN commands. They can be associated with an interval for each dimension, or, in the two-dimensional case, with a closed polygon of any shape (free form condition).
Unless inhibited by the user, each listmode event parameter to be accumulated is checked against the condition limits of the analyzer, and the corresponding flags are set. The flag values may be used to change the data flow in the analysis program.
The Display Windows and Points serve to store and display spectrum segments and channel numbers for (interactive) usage with commands, but they are not accessed in analysis programs.

Analyzer naming convention

Analyzers are identified by two names, the analyzer name and the qualifier name. A valid name consists of a sequence of one to 18 alphanumeric characters, which may be letters, digits, or the special characters '$', '@', '_', and '#'. The first character of a name must be a letter, '$' or '@'. The character '#' acts as a dimension separator used in display headings of multidimensional analyzers, for example.
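
The naming rule can be written down as a pattern. The following Python sketch is only an illustration of the rule as stated above; it assumes that "letters" means the ASCII letters A-Z and a-z, and the function name is_valid_name is hypothetical.

```python
import re

# One to 18 characters in total. The first is a letter, '$' or '@';
# the remaining ones may be letters, digits, '$', '@', '_' or '#'.
NAME_RE = re.compile(r"[A-Za-z$@][A-Za-z0-9$@_#]{0,17}")

def is_valid_name(name: str) -> bool:
    """True if name satisfies the analyzer/qualifier naming rule."""
    return NAME_RE.fullmatch(name) is not None

for candidate in ("REL_ENERGY", "DE#E", "$TMP", "1ABC", "_X"):
    print(candidate, is_valid_name(candidate))
```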

The qualifier name of analyzers depends on how it was created:

LIST if it was created in SATAN analysis programs (with the macro $AGEN()),
GLIST if it was created from listmode data by the GOLDA subsystem of the former EDAS system (GOLDA event type = 1),
DMI if it was created from singles data (DMI or CMI spectrum) by the GOLDA subsystem of the former EDAS system (GOLDA event type > 1).
arbitrary if it was created dynamically by the user (command AGEN).

Thus it is possible to have two analyzers with the name ENERGY, e.g. one with the qualifier LIST, and the other one with DMI.

Internally SATAN assigns a unique integer, incremented in the order of creation, to each analyzer and qualifier name. These numbers are used by most built-in commands and are also available to users. From the names, the numbers may be obtained with the procedure $ATRACE, or, at command level, with the commands AATT and ALIST.

Analyzer Creation

Analyzers may be created statically within the analysis program with the macro $AGEN() or dynamically on the command line level with the command AGEN. For example the macro

                                                                            
$AGEN(REL_ENERGY)

and the command

                                                                            
AGEN REL_ENERGY LIST

both create an analyzer with the name REL_ENERGY and the qualifier LIST. They are created with the default attributes

spectrum type 4 (4 byte integers)
one dimension,
limits 1 to 1024,
bin size 1,
no conditions,
no display windows, and
no error data.

Analyzers created dynamically on the command line level may be deleted (command ADES), and all attributes can be modified (command AMOD). In contrast, analyzers created statically with the macro $AGEN() cannot be deleted, and some attributes (type, dimension, and qualifier name) cannot be changed in a session. This is because the analysis program runs like a separate thread and vital analyzer information must not be changed while the analysis is in progress.

It is possible in analysis programs to create analyzer arrays:

                                                                            
$AGEN( mass(3) );

creates analyzers mass(1), mass(2), mass(3), respectively, with identical attributes.

Analyzer dimensionality and limits

Analyzers may have up to four dimensions. Thus a multidimensional analyzer can handle several input values simultaneously:

                                                                           
   $AGEN (DE#E) LIMITS(0,10,1,191);

creates a two-dimensional analyzer DE#E. The first dimension has 11 channels varying between 0 and 10, the second dimension has 191 channels varying between 1 and 191.
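
The channel counts follow directly from the limits. As a small check of the arithmetic, the Python sketch below takes the limits in the same flat order as the LIMITS(...) clause and assumes bin size 1; the function name is hypothetical.

```python
from math import prod

def channels_per_dimension(limits):
    """limits is a flat sequence (low1, high1, low2, high2, ...)."""
    pairs = zip(limits[0::2], limits[1::2])
    # Both limits are channel numbers that belong to the analyzer.
    return [high - low + 1 for low, high in pairs]

counts = channels_per_dimension((0, 10, 1, 191))
print(counts)        # channels in each dimension
print(prod(counts))  # total number of spectrum elements
```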

Analyzer bin size

To reduce the size of a spectrum, more than one channel number in any dimension may be defined to belong to one spectrum element (bin). The number of adjacent channels of one dimension contributing to a spectrum element is called bin size.

$AGEN (CHR) BINS(16) LIMITS(0,511);

In this example, every 16 adjacent channels of analyzer CHR are summed up to form one bin. The spectrum is reduced from 512 numbers (for bin size 1) to 32 numbers. This may be useful to improve statistics, and it saves storage.

Note:

All macros, commands, and displays use the original channel numbers of the incoming data, regardless of the analyzer bin size.

Analyzer Conditions

A condition in the simplest case is a set of channel number limits with one pair - the lower and upper limit - for each dimension. Conditions may be set by the commands ACDEF and DSWINDOW, or by the macro $ACDEF() in an analysis program:

                                                                         
 $AGEN (REL_ENERGY) LIM(0,4095) NCND(2);                                     
 $ACDEF (REL_ENERGY,1,15,360);                                               
 $ACDEF (REL_ENERGY,2,315,512);

The analyzer REL_ENERGY is created with two conditions. The subsequent invocations of $ACDEF() set the condition limits to the values 15 and 360 for condition number 1, and to 315 and 512 for condition number 2.

When accumulating with the macro $ANAL(), an associated flag is set by the system, if the analyzer input values for an event lie inside the condition limits. This flag can be checked in subsequent program sections using the macro $AC().

Conditions are numbered sequentially starting from 1. All conditions must lie within the analyzer limits.
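
The interplay of accumulation, conditions and flags can be mimicked in a few lines. The following Python sketch is a toy model, not SATAN code: the class and method names are hypothetical, both condition limits are assumed to be inclusive, and values outside the analyzer limits are assumed not to be counted.

```python
class Analyzer1D:
    """Toy one-dimensional analyzer: a spectrum plus interval conditions."""

    def __init__(self, low, high, n_conditions=0):
        self.low, self.high = low, high
        self.spectrum = [0] * (high - low + 1)
        self.conditions = [None] * n_conditions   # (lower, upper) per condition
        self.flags = [False] * n_conditions

    def define_condition(self, number, lower, upper):
        # Conditions are numbered from 1 and must lie within the analyzer limits.
        if not (self.low <= lower <= upper <= self.high):
            raise ValueError("condition outside analyzer limits")
        self.conditions[number - 1] = (lower, upper)

    def accumulate(self, channel):
        # Assumption: values outside the analyzer limits are not counted.
        if self.low <= channel <= self.high:
            self.spectrum[channel - self.low] += 1
        # Assumption: both condition limits are inclusive.
        for i, cond in enumerate(self.conditions):
            self.flags[i] = cond is not None and cond[0] <= channel <= cond[1]

    def condition_flag(self, number):
        """Flag of the given condition for the most recently accumulated value."""
        return self.flags[number - 1]

rel_energy = Analyzer1D(0, 4095, n_conditions=2)
rel_energy.define_condition(1, 15, 360)
rel_energy.define_condition(2, 315, 512)
rel_energy.accumulate(340)   # lies inside both conditions
print(rel_energy.condition_flag(1), rel_energy.condition_flag(2))
```

Since the two conditions overlap between 315 and 360, a single event can set both flags at once.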

Free Form Conditions

The concept of the simple rectangular condition has been extended to a more general one allowing any shape of the condition in the case of a two-dimensional analyzer. The free form condition limits are defined by a closed polygon in the x-y coordinate plane.

With the command ACDEF the coordinates of the polygon edges can be entered directly or via global parameters, and existing free form conditions can be copied from other analyzers. With the command D2COND, free form conditions can be created and edited using a graphic cursor.
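
Checking whether an event lies inside a free form condition amounts to a point-in-polygon test. How SATAN implements this test is not described here; the Python sketch below uses the standard even-odd (ray casting) rule as one possible approach. The polygon coordinates are made up, and the treatment of points lying exactly on the polygon border is left unspecified.

```python
def inside_polygon(x, y, vertices):
    """Even-odd (ray casting) test; vertices is a list of (x, y) corner points."""
    inside = False
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]   # the last corner connects back to the first
        # Does this edge straddle the horizontal line through y?
        if (y1 > y) != (y2 > y):
            # x coordinate at which the edge crosses that line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

# Hypothetical banana-shaped window in the x-y plane.
banana = [(10, 10), (60, 20), (100, 60), (90, 70), (50, 35), (10, 25)]
print(inside_polygon(50, 25, banana))
print(inside_polygon(20, 50, banana))
```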

Analyzer types

The analyzer type specifies what kind of spectrum is attached to the analyzer.

Type 0 analyzers have no spectrum. Their main purpose is condition checking. They are also useful for live display output.

You can choose between 2-byte (type 2) and 4-byte (type 4) integers, and 4-byte floating point spectra (type 24).

Analyzer Channels and Bins

Analysis programs accumulate data in analyzer spectra. The spectrum elements are called channels or bins. For analyzers with dimension d, a channel of the spectrum is specified by d numbers.

One-dimensional analyzers, created with default attributes, have 1024 channels or bins:

                                                                            
 1, 2, ..., 1024.

The lower limit defines the first channel, and the upper limit the last channel. Corresponding to MCA hardware, the range of channel n is the semi-closed interval

                                                                            
[n,n+1).

The left border of channel n has the value n and belongs to channel n, whereas the right border n+1 belongs not to channel n, but to the next channel n+1.

If the analyzer bin size has a value > 1, several channels are merged to form a bin. For example, the analyzer mentioned above has only 256 bins with bin size 4. In general, for an analyzer with lower limit L, the lower and upper border of bin n have the following values:

                                                                         
bin size   lower border   upper border
    1      (n-1)+L        n+L
    B      (n-1)*B+L      n*B+L

The bin mean is the arithmetic mean of the lower and upper bin border. Again the left bin border belongs to the bin, whereas the right bin border does not.
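
The border formulas translate directly into code. The following Python sketch uses hypothetical function names and integer channel numbers; it merely restates the table above and adds the inverse mapping from a channel number to its bin.

```python
def bin_number(channel, lower_limit, bin_size=1):
    """1-based number of the bin that contains the given channel number."""
    return (channel - lower_limit) // bin_size + 1

def bin_borders(n, lower_limit, bin_size=1):
    """(lower, upper) border of bin n; the lower border belongs to the bin."""
    lower = (n - 1) * bin_size + lower_limit
    return lower, lower + bin_size

def bin_mean(n, lower_limit, bin_size=1):
    """Arithmetic mean of the lower and upper border of bin n."""
    lower, upper = bin_borders(n, lower_limit, bin_size)
    return (lower + upper) / 2

# Analyzer CHR from the bin size example: limits 0 to 511, bin size 16.
print(bin_number(15, 0, 16), bin_number(16, 0, 16), bin_number(511, 0, 16))
print(bin_borders(32, 0, 16), bin_mean(1, 0, 16))
```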

With the calibration data element, a tool is provided to assign any value to bin means or borders. This is especially useful for display purposes to calibrate display axes.

In SATAN notation, the channel numbers are always associated with a bin size 1, and bins and channels are only identical for analyzers with bin size 1. Remember that the original channel numbers of the incoming data are used by all macros, commands, and displays, regardless of the analyzer bin size.

Analyzer storage and retrieval

Analyzers are stored in native format in data element libraries. These libraries are regular files, with an internal structure according to the Gnu Data Base Management system. Analyzer data written into the libraries are associated with a unique key. This technique resembles the VSAM datasets used in the former MVS-based SATAN system and allows an easy conversion of both old software and old data.
When analyzers are written to a library they can be associated with a run identifier. The process of writing analyzers is called a dump. For each write action the dump number is incremented. Thus multiple generations of the same analyzers of the same run can be held in a single data element library. When analyzers are retrieved from a data element library, the run identifier and the dump number have to be specified in addition to the analyzer name and qualifier (event identifier).
You use the command ADUMP to store analyzers and AFETCH to retrieve them.
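
The keyed storage scheme can be pictured with an ordinary dictionary. The Python sketch below is a toy model only: it does not reproduce the actual library file format or key layout, the class and method names are hypothetical, and it assumes that the dump number is counted separately for each run.

```python
class ElementLibrary:
    """Toy stand-in for a data element library: a keyed store of analyzer dumps."""

    def __init__(self):
        self._records = {}     # (run, dump, name, qualifier) -> spectrum
        self._last_dump = {}   # run -> highest dump number used so far

    def dump(self, run, analyzers):
        """Write all given analyzers as one new dump and return its dump number."""
        number = self._last_dump.get(run, 0) + 1   # assumption: counted per run
        self._last_dump[run] = number
        for (name, qualifier), spectrum in analyzers.items():
            self._records[(run, number, name, qualifier)] = list(spectrum)
        return number

    def fetch(self, run, number, name, qualifier):
        """Retrieve one analyzer spectrum; all four key parts are required."""
        return self._records[(run, number, name, qualifier)]

library = ElementLibrary()
first = library.dump("RUN01", {("ENERGY", "LIST"): [0, 1, 2]})
second = library.dump("RUN01", {("ENERGY", "LIST"): [0, 3, 5]})
print(first, second, library.fetch("RUN01", first, "ENERGY", "LIST"))
```

Both generations of ENERGY remain retrievable because the dump number is part of the key.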

Analyzer export

At present the only way to export analyzers is in gd format using the AEXPORT command.

Analyzer import

At present the only way to import analyzers is from the old MVS-based VSAM libraries. Please consult the section on MVS legacies for details.

Analyzer Handling Functions

Several macros, functions, analyzer commands and display commands are available to the user for various analyzer operations.

Calibrations

The calibration data element provides one-dimensional arrays of real numbers, which may be used as calibrated coordinates. References between analyzer spectra and calibrations can be created by command. This enables, for example, the display of analyzer spectra with calibrated axes, or the use of calibrated limits. To calibrate all coordinate axes of an n-dimensional spectrum, a reference to a calibration is needed for each dimension of the spectrum.
For the same calibration, references may exist to several different analyzer spectra, or even to different dimensions of the same (multi-dimensional) analyzer spectrum.
References between analyzers and calibrations can be created and deleted dynamically by SATAN commands. However, references are not stored in the data element libraries. With the end of a SATAN session, all references are lost and must be recreated when starting a new SATAN session.
The calibration values may be defined by a polynomial (polynomial calibration), or they can be specified explicitly by the user. The calibration data element consists of:

a one-dimensional array containing the calibrated coordinates or - in case of a polynomial calibration - the coefficients of the polynomial,
limits, expressed in terms of (virtual) analyzer channel numbers, which specify the validity range, and
an axis description for display.
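
For a polynomial calibration, the calibrated coordinate of a channel is obtained by evaluating the polynomial at that channel number. The Python sketch below illustrates this under stated assumptions: the coefficients are taken in ascending order (constant term first), the validity limits are treated as inclusive, and the function name and the example coefficients are hypothetical.

```python
def calibrate(channel, coefficients, limits):
    """Evaluate c0 + c1*x + c2*x**2 + ... at channel x using the Horner scheme."""
    low, high = limits
    if not (low <= channel <= high):
        raise ValueError("channel outside the validity range of the calibration")
    value = 0.0
    for c in reversed(coefficients):
        value = value * channel + c
    return value

# Hypothetical linear energy calibration: offset 0.5, slope 2.0 per channel,
# valid for channels 1 to 1024.
print(calibrate(100, [0.5, 2.0], (1, 1024)))
```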

Calibration storage and retrieval

Calibrations are stored and retrieved the same way as analyzers. Please consult the section on Analyzer storage and retrieval for details.

Linearizations

The linearization data element may be used for the evaluation of two-dimensional spectra, which are a function of two correlated parameters. In the linearization process, the functional dependence on one of the two parameters is removed by applying a given linearization prescription, which is stored in the corresponding linearization data structure. For example, assume the following functional dependence:

charge = f(total energy, energy loss),

as in the case of charge separation in ionization chambers. The result of a linearization process is the charge as a function of only one of the parameters, for instance

                                                                            
charge = f(energy loss).

The linearization data element comprises:

a gate, implemented as a free form (banana) window, specifying the working area, and
several polygon lines containing the information for the removal of the functional dependence on one parameter (linearization prescription).

Gate and linearization prescription can be created, modified, or deleted interactively with SATAN commands specifying graphics or alphanumeric input. The linearization prescription may be applied to an analyzer spectrum by command or in analysis programs. A more detailed description of the linearization procedure is available separately.

Linearization storage and retrieval

Linearizations are stored and retrieved the same way as analyzers. Please consult the section on Analyzer storage and retrieval for details.

Global parameters

In most analysis programs, calibrations or other numerical calculations have to be performed on the raw data. This implies the use of numerical constants whose exact values are mostly unknown at the time the program is written. Global parameters solve this problem. They may also be used to control data flow and execution logic in analysis programs.

A specific global parameter is unique in a SATAN session and known in all user programs (analysis (sub)routines, user command procedures), in which it is declared with the macro $PARDCL(). On the command level a global parameter may be declared with the command IPARDCL.

With the command IPAR, global parameters can be accessed, modified and listed. Most contents of all data elements can be stored into global parameters using the command ASTORE, and several other commands provide output into global parameter arrays. Because global parameters can be read from and stored into plain text files, they are an important interface between SATAN and other programs running independent of the SATAN environment.

For more details refer to the description of global parameters.

Last updated: M.Kraemer@gsi.de, 2-Aug-1999