SAS Tips: Introduction to arrays

SAS arrays are useful when we wish to perform a similar operation on a set of variables. For example:

array weight wt1-wt50;

do i=1 to 50;
    if weight{i}=999 then weight{i}=.;
end;

A SAS array is nothing more than a named collection of variables of the same type (all numeric or all character). Each variable is identified by the array name together with an index giving its position within the array.

SAS arrays are defined using the ARRAY statement, and are only valid within the data step in which they are defined.
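As a minimal sketch of this scoping rule, the following wraps the recoding loop in complete data steps. The data set names (raw, clean, check), the variable n_missing, and the three-variable input are hypothetical and chosen only to keep the example short.

```sas
/* Hypothetical input: three weights, with 999 used as a missing-value code */
data raw;
    input id wt1-wt3;
    datalines;
1 70 999 82
2 999 65 999
;
run;

data clean;
    set raw;
    array weight {*} wt1-wt3;     /* the array exists only in this data step */
    do i = 1 to dim(weight);
        if weight{i} = 999 then weight{i} = .;
    end;
    drop i;                       /* otherwise the index variable is written to the output */
run;

data check;
    set clean;
    array weight {*} wt1-wt3;     /* must be declared again before it can be used here */
    n_missing = nmiss(of weight{*});
run;
```

The variables wt1-wt3 are stored in the data set, but the array name weight is not, which is why the third step has to repeat the ARRAY statement.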

The syntax for the ARRAY statement is:

ARRAY array-name {subscript} <$> <length>
      <<array-elements> <(initial-values)>>;

array-name must follow the naming rules for SAS variables.

{ subscript } gives the dimension of the array (or the dimensions, for a multidimensional array). It can be omitted or specified as {*}, in which case SAS infers the dimension from the number of array elements.
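The different ways of giving the subscript can be compared with the DIM function, which returns the number of elements in a given dimension. This is only an illustrative sketch; the array and variable names are made up.

```sas
data _null_;
    array a {3} x1-x3;        /* explicit dimension */
    array b {*} y1-y5;        /* dimension inferred from the 5 elements */
    array c z1-z4;            /* subscript omitted: 4 elements */
    array m {2,3} m1-m6;      /* two dimensions: 2 rows by 3 columns */
    na  = dim(a);
    nb  = dim(b);
    nc  = dim(c);
    nm1 = dim(m, 1);          /* size of the first dimension */
    nm2 = dim(m, 2);          /* size of the second dimension */
    put na= nb= nc= nm1= nm2=;
run;
```

Writing loops as do i = 1 to dim(a); rather than hard-coding the upper bound means the loop does not need to be edited if the list of array elements changes.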

< array-elements > is the list of variables that make up the array. It can be omitted if the dimension is given, in which case SAS creates variables named array-name1 to array-nameN, where N is the dimension of the array. For example:
array wt {50};
will cause the variables wt1-wt50 to be created.
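The optional $, length, and initial-values parts of the syntax can be sketched in the same way. The array grp and its values below are hypothetical and serve only to illustrate the syntax.

```sas
data _null_;
    array wt {3};                             /* creates numeric wt1, wt2, wt3 */
    array grp {3} $ 8 ('low' 'mid' 'high');   /* creates character grp1-grp3 of length 8 with initial values */
    wt{2} = 61.5;                             /* assigning through the array sets wt2 */
    put wt1= wt2= wt3= grp1= grp2= grp3=;     /* the created variables can be used by name */
run;
```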

Example using data from the Swedish Fertility Registry
For each record, we have information on up to 12 'events'. The event type (usually a birth) is stored in the variables type1-type12 and the corresponding date is stored in the variables date1-date12.

The coding for the 'type' variables is:
0=stillbirth
1=live boy
2=live girl
6=emigration

For each woman, we want to count the total number of live births and the total number of completed pregnancies (live births plus stillbirths), and to extract the emigration date for women who emigrated.

array type type1-type12;
array datum date1-date12;
births=0; fullterm=0; emigrate=0;
do i = 1 to 12;
    if type[i] in (1,2) then births=sum(births,1);
    if type[i] in (0,1,2) then fullterm=sum(fullterm,1);
    if type[i] in (6) then do;
        emigrate=1;
        emi_date=input(datum[i],yymmdd.);
    end;
end;
label
births='No. live births'
fullterm='No. completed pregnancies'
emigrate='Emigration indicator'
emi_date='Date of emigration (SAS date)'
;
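To try the code without access to the registry, a scaled-down version can be run on made-up records. Everything in this sketch is an assumption for illustration: the data set names (events, summary), the two records, the use of 3 events per woman instead of 12, and the storage of dates as six-character yymmdd text (which the INPUT function with the yymmdd. informat requires).

```sas
/* Hypothetical records: 3 events per woman, dates stored as yymmdd text */
data events;
    input id type1-type3 (date1-date3) (:$6.);
    datalines;
1 1 2 6 850312 870604 900101
2 0 1 . 880220 910715 .
;
run;

data summary;
    set events;
    array type {*} type1-type3;
    array datum {*} date1-date3;
    births = 0; fullterm = 0; emigrate = 0;
    do i = 1 to dim(type);
        if type{i} in (1,2) then births = sum(births, 1);
        if type{i} in (0,1,2) then fullterm = sum(fullterm, 1);
        if type{i} = 6 then do;
            emigrate = 1;
            emi_date = input(datum{i}, yymmdd6.);   /* text to SAS date */
        end;
    end;
    format emi_date yymmdd10.;   /* display the SAS date in readable form */
    drop i;
run;

proc print data=summary;
    var id births fullterm emigrate emi_date;
run;
```

With these records, the first woman has two live births followed by an emigration, and the second has one stillbirth and one live birth with no emigration, so emi_date is missing for her.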

References
Arrays: SAS Language, version 6, first edition, pages 292-306.

Paul Dickman
Professor of Biostatistics

Biostatistician working with register-based cancer epidemiology.