So far we have declared variables one at a time. We will often need many variables in a program, and "array"s are a technique for declaring many variables of the same type in one statement, and for being able to consider the whole collection as a single object for appropriate operations.
A "structure" is a means for combining several related items which may be of different types as a single object, while still being able to refer to the separate individual components if we wish to.
We are analysing the annual rainfall over a period of 90 years; we require 90 float variables to store the information.
We are studying a piece of English text; we require 10000 char variables to store the characters.
We are recording the numbers of students in each department of the university; we need (depending on how many departments there are) 100 int variables.
The above examples would be declared as
/* 90 float variables */
float annual_rain[ 90 ];
/* 10000 char variables */
char text[ 10000 ];
/* 100 int variables */
int stud_nos[ 100 ];
The number of elements requested in the declaration must be fixed at compile time.
To get at a particular variable (element) in an array, we write the identifier of the array followed by the subscript in square brackets. In C if we declare
float annual_rain[ 90 ];
then the subscripts run from 0 (zero) to 89 inclusive; this gives us the required number (90) of variables.
To access the 23rd of these variables, we would put the subscript in square brackets after the array identifier, and use
... annual_rain[ 22 ] ...
The subscript can be any integer expression.
To store a value in this variable we may use
annual_rain[ 22 ] = ....;
or perhaps
scanf( "%f", &annual_rain[ 22 ] );
and to use the value stored there
if ( annual_rain[ 22 ] > minimum ) { total = total + annual_rain[ 22 ]; }
The real use of arrays is when we refer to each element of the array not by a specific constant subscript, but access all elements in turn or choose a particular element dynamically. To access the variables in turn, for example to read 90 values from the data into the 90 locations, we would use
int year; for( year = 0; year < 90; year++ ) { /* Variable "year" goes from 0 to 89 */ scanf( "%f", &annual_rain[ year ] ); } /* for year */
The 90 values would have to appear in the data stream in the correct order, separated by "white space", i.e. spaces, tabs or newlines.
Having read the 90 values in, we may wish to calculate the total and average rainfall over these 90 years. For this we would use
float total = 0; /* declaration to follow "main" */ for( year = 0; year < 90; year++ ) { /* Each array element is a "float" */ total += annual_rain[ year ]; } /* for year */ printf( "Total %f\n", total ); printf( "Average %f\n", total / 90 );
In the above examples, the constant value 90 keeps appearing. We should really take this out as a constant, so that the actual value appears at only one point in the program. The program now becomes as follows.
#include <stdio.h>

/* Global constant */
#define duration 90

int main( void )
{
    float annual_rain[ duration ];
    int year;
    float total = 0;

    for( year = 0; year < duration; year++ ) {
        scanf( "%f", &annual_rain[ year ] );
    } /* for year */
    for( year = 0; year < duration; year++ ) {
        total += annual_rain[ year ];
    } /* for year */
    printf( "Total %f\n", total );
    printf( "Average %f\n", total / duration );
    return 0;
} /* end main */
We will not now have problems when the number of years duration of the rainfall analysis needs to be changed; all values of the duration will change in step together when we change the value of the global constant. If we had not done this, we might have forgotten to change some of the occurrences of "90" to another value.
Note that the value of the "#define" denoting the number of elements in the array must be a genuine integer constant! When the compiler is digesting your program, it must know exactly how many array elements are required. The constant must not be one which is determined as the result of some calculation while the program is running.
Note also the typical C loop, starting at zero, and limited by "counter strictly less than number of elements". This gives us the range of values 0, ..., 89 if there are 90 elements. There is NO element with subscript 90.
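The summing loop above can be packaged as a function. The following is a minimal sketch only: the function name "average" and its parameter names are assumptions made for illustration and do not appear in the program above. It uses the same loop shape, starting at zero and stopping strictly below the number of elements.

```c
/* Return the mean of the first "n" elements of "values".
   The name "average" is an illustrative assumption. */
float average( const float values[ ], int n )
{
    float total = 0;
    int i;

    if ( n <= 0 ) {
        return 0;               /* nothing to average */
    }
    for ( i = 0; i < n; i++ ) { /* subscripts 0 to n-1 only */
        total += values[ i ];
    } /* for i */
    return total / n;
}
```

With the declarations used earlier, a call might be written as "average( annual_rain, duration )".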
To count the number of occurrences of the letter 'e' in a piece of text, we will write a program to read the characters of the text into an array of characters, and then search the array for occurrences of the letter 'e'. We will need an array of char elements. We will read characters from the input until we encounter a full stop.
/* Set maximum number of characters */
#define max 100

/* Grab space for 100 characters */
char sentence[ max ];
int i = 0;

/* Read up to a full stop */
while( scanf( "%c", &sentence[ i ] ) == 1 && sentence[ i ] != '.' ) {
    if ( ++i >= max ) {
        fprintf( stderr, "Error sentence overflow\n" );
        exit( 1 );
    }
} /* while read character not full stop */
We should declare the array long enough to hold all likely sentences, and check that the data does not overflow it.
If we ran the program, and typed
The cat sat on the mat.
at the terminal, the code would set
sentence[ 0 ] | to the value | 'T' |
sentence[ 1 ] | to the value | 'h' |
sentence[ 2 ] | to the value | 'e' |
sentence[ 3 ] | to the value | ' ' |
To search through the array once it has been read in looking for occurrences of the letter 'e', we use
int count = 0; /* Now go through the array counting */ for( i = 0; sentence[ i ] != '.'; i++ ) { if( sentence[ i ] == 'e' ) { count++; } /* end if letter is e */ } /* end search sentence */ printf( "Number of e's is %d\n", count );
We again control the loop by looking for the full stop '.'. It would be possible instead to count the total number of characters as they are read in, and use this count to control the upper limit of the loop.
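The alternative just mentioned, controlling the loop by a count of characters rather than by the full stop, might look like the sketch below. The function name "count_letter" and the parameter "length" are assumptions for illustration; "length" is taken to be the number of characters that were read in.

```c
/* Count how often "letter" occurs in the first "length"
   characters of "text".  The loop limit is the character
   count, not a terminating full stop. */
int count_letter( const char text[ ], int length, char letter )
{
    int count = 0;
    int i;

    for ( i = 0; i < length; i++ ) {
        if ( text[ i ] == letter ) {
            count++;
        } /* end if letter matches */
    } /* end search text */
    return count;
}
```

This form also copes with text which contains no full stop at all, since the loop can never run beyond the characters actually stored.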
We wish to read integers into an array (of int variables), sort them into ascending order by swapping, and then print them out. We will assume that the data is terminated by a zero value. First the declarations:
/* the maximum number of ints */
#define max_n 100
int array[ max_n ];
Then we read the data in:
int i = 0; int number; while( scanf( "%d", &array[ i ] ) == 1 && array[ i ] != 0 ) { if ( ++i >= max_n ) { fprintf( stderr, "Error array overflow\n" ); exit( 1 ); } } /* while read up to zero */ /* note how many we read in */ number = i;
Then we sort the numbers into ascending order by swapping:
for( i = 0; i < number; i++ ) { for( j = 0; j < i; j++ ) { if( array[ j ] > array[ i ] ) { /* swap if out of order */ swap = array[ i ]; array[ i ] = array[ j ]; array[ j ] = swap; } /* if out of order */ } /* for j up to i-1 */ } /* for i in the array */
Then we might print the results:
for( i = 0; i < number; i++ ) { printf( "%d %d\n", i, array[ i ] ); } /* for i */
To print the results ten entries per line
/* npl = number per line */
#define npl 10

for( i = 0; i < number; i++ ) {
    printf( "%d ", array[ i ] );
    if ( (i+1) % npl == 0 ) {
        printf( "\n" ); /* newline */
    }
} /* for i */
printf( "\n" );
The above program when put together becomes as follows.
#include <stdio.h>
#include <stdlib.h>

/* the maximum number of ints */
#define max_n 100
/* npl = number printed per line */
#define npl 10

int main( void )
{
    int array[ max_n ];
    int i = 0, number;
    int j, swap;

    while( scanf( "%d", &array[ i ] ) == 1 && array[ i ] != 0 ) {
        if ( ++i >= max_n ) {
            fprintf( stderr, "Error array overflow\n" );
            exit( 1 );
        } /* if check array overflow */
    } /* while read up to zero */
    /* note how many numbers we read in */
    number = i;

    /* Now order them */
    for( i = 0; i < number; i++ ) {
        for( j = 0; j < i; j++ ) {
            if( array[ j ] > array[ i ] ) {
                /* swap if out of order */
                swap = array[ i ];
                array[ i ] = array[ j ];
                array[ j ] = swap;
            } /* if out of order */
        } /* for j up to i-1 */
    } /* for i in the array */

    /* Now print them, ten per line */
    for( i = 0; i < number; i++ ) {
        printf( "%d ", array[ i ] );
        if ( (i+1) % npl == 0 ) {
            printf( "\n" ); /* newline */
        }
    } /* for i */
    printf( "\n" );
    return 0;
} /* end main program */
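The ordering step can also be separated out as a function, which makes it easy to check on a small array. This is a sketch using the same swapping scheme as the program above; the function name "sort_ascending" is an assumption for illustration.

```c
/* Sort the first "number" elements of "array" into ascending
   order, using the same swapping scheme as the program above. */
void sort_ascending( int array[ ], int number )
{
    int i, j, swap;

    for ( i = 0; i < number; i++ ) {
        for ( j = 0; j < i; j++ ) {
            if ( array[ j ] > array[ i ] ) {
                /* swap if out of order */
                swap = array[ i ];
                array[ i ] = array[ j ];
                array[ j ] = swap;
            } /* if out of order */
        } /* for j up to i-1 */
    } /* for i in the array */
}
```

Each time the outer loop moves on, the elements from subscript 0 up to subscript i are in ascending order, which is why the inner loop only needs to look at subscripts below i.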
Always be careful to distinguish between the operation of finding the largest value in an array and that of finding the position of the largest value.
To find the maximum value in an array you may write
float values[100];
/* Assume that the array is set up */
/* with values terminated by a negative value. */
int sub; /* Subscript */
float max = values[0];
for ( sub = 1; values[ sub ] >= 0; sub++ ) {
    if ( max < values[ sub ] ) {
        max = values[ sub ];
    }
} /* for sub in array */
We start by assuming that the first element is the largest, and compare each of the remaining elements (subscripts from 1 upwards) with it in turn.
We now have the largest value in the "float" variable "max". The type of the variable "max" will be the same as the type of the array elements.
To find the position (the "index") of the largest element, write
float values[100];
/* Assume the array is set up */
/* as before. */
int sub;
int maxpos = 0;
for ( sub = 1; values[ sub ] >= 0; sub++ ) {
    if ( values[ maxpos ] < values[ sub ] ) {
        maxpos = sub;
    }
} /* for sub in array */
printf( "Largest value %f\n", values[ maxpos ] );
printf( "Position %d\n", maxpos );
We assume the position of the largest element is zero, and compare each other element with it in turn. We now have the position of the largest element in the "int" variable "maxpos". The type of the variable "maxpos" will always be int.
Note that there is always a unique maximum value for an element of an array; there may not be a unique position of the largest element if there are several equal largest values.
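The point about equal largest values can be seen in a small sketch. The function name "max_position" and the explicit element count "n" (used here instead of a terminating marker) are assumptions for illustration. Because the comparison uses a strict "<", the function reports the first of several equal largest values.

```c
/* Return the subscript of the largest of the first "n"
   elements of "values" (n must be at least 1).
   With several equal largest values, the first is reported. */
int max_position( const float values[ ], int n )
{
    int sub;
    int maxpos = 0;

    for ( sub = 1; sub < n; sub++ ) {
        if ( values[ maxpos ] < values[ sub ] ) {
            maxpos = sub;
        }
    } /* for sub in array */
    return maxpos;
}
```

Changing the comparison to "<=" would report the last of the equal largest values instead; the maximum value itself is the same either way.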
To search for the occurrences of a particular word such as "the" in the text, we must search for occurrences of the 3 characters 't', 'h' and 'e' in adjacent positions in the array.
count = 0; /* now go through the array counting */ for( i = 0; sentence[ i ] != '.'; i++ ) { if ( i < 2 ) { continue; } if( sentence[ i-2 ] == 't' && sentence[ i-1 ] == 'h' && sentence[ i ] == 'e' ) { count++; } /* end if word "the" */ } /* end search sentence */
Note that in this case we start the subscript at 2 rather than zero.
We could move the subscript from zero upwards using
count = 0; /* now go through the array counting */ if ( sentence[0] != '.' && sentence[1] != '.' ) { for( i = 0; sentence[ i+2 ] != '.'; i++ ) { if( sentence[ i ] == 't' && sentence[ i+1 ] == 'h' && sentence[ i+2 ] == 'e' ) { count++; } /* end if word "the" */ } /* end search sentence */ }
In either case, we must be careful not to run off either end of the array.
If we were actually looking for the word "the" for serious linguistic reasons, we would need to check that there were non-letters at both ends of the string "the" wherever we find it, using perhaps
if ( ( i < 3 || sentence[ i-3 ] not a letter ) && sentence[ i+1 ] not a letter ) { ...
There is a standard library function "isalpha" (declared in the header file <ctype.h>) to check whether a character given as parameter represents a letter. In addition we would probably accept either a leading upper case 'T' or lower case 't', using a construction such as
if ( sentence[ i-2 ] == 'T' || sentence[ i-2 ] == 't' ) { ...
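Putting these checks together gives something like the following sketch. The function name "count_the" is an assumption for illustration; the sentence is assumed to end with a full stop, as in the earlier examples. The cast to "unsigned char" is the usual precaution when passing a "char" to "isalpha".

```c
#include <ctype.h>

/* Count whole-word occurrences of "the" or "The" in a
   sentence which is terminated by a full stop. */
int count_the( const char sentence[ ] )
{
    int count = 0;
    int i;

    for ( i = 0; sentence[ i ] != '.'; i++ ) {
        if ( i < 2 ) {
            continue;   /* not enough characters yet */
        }
        if ( ( sentence[ i-2 ] == 'T' || sentence[ i-2 ] == 't' )
             && sentence[ i-1 ] == 'h'
             && sentence[ i ] == 'e'
             /* non-letter (or start of text) before the word */
             && ( i < 3 || !isalpha( (unsigned char) sentence[ i-3 ] ) )
             /* non-letter after the word; at worst the full stop */
             && !isalpha( (unsigned char) sentence[ i+1 ] ) ) {
            count++;
        } /* end if word "the" */
    } /* end search sentence */
    return count;
}
```

Looking at "sentence[ i+1 ]" is safe here because the loop only continues while "sentence[ i ]" is not the full stop, so the next element is at worst the full stop itself.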
Instead of looking for a specific word such as "the", we may wish to look for a general word (string). We would then store the word we are looking for in a second character array. To search our stored sentence for a word of length "l_word" stored in a character array "word" (assuming that the value in l_word has been set up to equal the length of the word stored in the character array word ) the second part of the program would have to be turned into a loop to compare characters in the word with characters in the array. The code might be:
count = 0; int found; /* Now go through the sentence counting */ for( i = l_word-1; /* Start at length of word */ sentence[ i ] != '.'; /* Stop at full stop */ i++ ) { found = 1; /* 1 means true */ for ( j = 0; j < l_word; j++ ) { if( sentence[ i-l_word+j+1 ] != word[ j ] ) { found = 0; /* 0 means false */ } /* end if next letter found */ } /* end for all chars in word */ if( found ) { /* if ( found == 1 ) would do */ count++; } /* end if found */ } /* end search sentence */
The above example would be a little more easily readable if we declared
#define TRUE 1
#define FALSE 0
and use the values TRUE or FALSE later in the program.
Any operation like this will be much more easily performed by using the string library functions, perhaps "strncmp" in this case:
count = 0; /* Now go through the array counting */ for( i = 0; sentence[ i ] != '.'; i++ ) { if( strncmp( word, sentence + i, l_word ) == 0 ) { count++; } /* end if found */ } /* end search sentence */
For serious programming, always use library functions whenever they are available.
All of the string library functions assume that the characters stored in the array are terminated by a null (zero) character. Such arrays could be printed by "printf", which will print characters from a character array until a zero element is encountered.
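As a sketch of how this null-terminated convention is used, the word count can be written for a text which ends with a null character rather than a full stop. The function name "count_word" is an assumption for illustration; "strlen" and "strncmp" are the standard functions from <string.h>.

```c
#include <string.h>

/* Count the occurrences (overlapping ones included) of the
   null-terminated string "word" in the null-terminated "text". */
int count_word( const char text[ ], const char word[ ] )
{
    size_t l_word = strlen( word );
    size_t i;
    int count = 0;

    if ( l_word == 0 ) {
        return 0;       /* an empty word is never counted */
    }
    for ( i = 0; text[ i ] != '\0'; i++ ) {
        /* strncmp stops at the null, so it cannot run off
           the end of the text */
        if ( strncmp( word, text + i, l_word ) == 0 ) {
            count++;
        } /* end if found */
    } /* end search text */
    return count;
}
```

Because "strlen" finds the length from the terminating null, there is no need to keep a separate "l_word" value up to date by hand.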
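If you wish the values of an array to be initialised, the initial values are written in the declaration. Older (pre-ANSI) compilers allow this ONLY FOR GLOBAL DECLARATIONS before the line containing "main"; ANSI C also allows it for arrays declared inside a function. You can write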
int primes[ ] = { 1, 2, 3, 5, 7, 11 };
with the initial values separated by commas, within curly braces. You need not put a value for the length of the array between the square brackets, since the compiler can count how many elements you have declared! This would set up an array of six elements starting at subscript zero with
primes[0] = 1 primes[1] = 2 etc
If you do put a value between the square brackets, as in
int primes[ 10 ] = { 1, 2, 3, 5, 7, 11 };
an array of 10 elements will be declared, with the remaining elements (4 in this case) set to zero. The number given between the square brackets must be greater than or equal to the number of initialising values given.
To initialise a character array, you could write of course
char word[ ] = { 'E', 'r', 'i', 'c' };
An alternative notation has been devised because initialised strings are required so often. You can also write
char word[ ] = "John";
This actually initialises a 5-character array, with an additional zero element at the end; this is provided so that programs using the array can keep looping until they find the zero element. The form of a loop using this array of characters (an array with a terminating zero element) is now
int sub; /* Our subscript */ for( sub = 0; word[ sub ] != 0; sub++ ) { ....; }
We could, of course, omit the "!= 0".
Notice that "word" is a straightforward array of character elements. If we execute
word[2] = 'a';
the word becomes "Joan".
To operate on the elements of an initialised array you will need to know how many elements it has. It is bad practice to have a separate global constant giving the number of elements, declared separately from the initialised array declaration, since there is always a possibility that the length value might not agree with the actual length.
One possibility is to use the compile-time operator sizeof (deliver the size in bytes of an object) which was mentioned earlier, and write
#define arr_length ( sizeof annual_rain / sizeof annual_rain[0] )
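Applied to an initialised array, the same idea might be sketched as follows. The names "n_primes" and "sum_primes" are assumptions for illustration; the array is the "primes" example from above. Note that the definition has no trailing semicolon and is enclosed in brackets, so that it can be used safely inside a larger expression.

```c
int primes[ ] = { 1, 2, 3, 5, 7, 11 };

/* Number of elements, worked out by the compiler */
#define n_primes ( sizeof primes / sizeof primes[ 0 ] )

/* Add up all the elements of the "primes" array */
int sum_primes( void )
{
    int total = 0;
    int i;

    for ( i = 0; i < (int) n_primes; i++ ) {
        total += primes[ i ];
    } /* for i */
    return total;
}
```

Adding further values to the initialiser changes "n_primes" automatically the next time the program is compiled.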
Alternatively you can put a special marker element at the end of the array, and loop through the elements until you encounter that special value. For example
int primes[ ] = { 1, 2, 3, 5, 7, 11, 0 }; for( i = 0; primes[ i ] != 0; i++ ) { ....; } /* for i loop */
Adding extra prime numbers into the declaration at a later stage will not affect the correct functioning of the loop, as long as the last element is always a zero.
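A sketch of the marker technique as a function is given below. The function name "count_to_marker" is an assumption for illustration. It relies entirely on the caller having put a zero at the end of the array.

```c
/* Count the elements of "array" which come before the
   terminating zero marker. */
int count_to_marker( const int array[ ] )
{
    int i;

    for ( i = 0; array[ i ] != 0; i++ ) {
        ;   /* nothing to do except move along */
    } /* for i loop */
    return i;
}
```

The limitation of this technique is that the marker value (zero here) can never be stored in the array as an ordinary data value.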
This is exactly the way that strings are scanned by library functions; you may well see the lazy version written as
for( i = 0; sentence[ i ]; i++ ) { ....; } /* for i loop */
where the "!= 0" is omitted.
Initialising a globally declared array takes NO time while the program is running. The appropriate locations have already been initialised to the correct values in your executable file.
Arrays and variables declared in global, and not otherwise initialised, are all initialised to zero.
The above arrays are one-dimensional, and can store a single "row" or "column" or "vector" of values. Suppose we wish to store a table (two-dimensional) of values; these might be the rainfall for each of 12 months for each of 90 years, or the IQs of each of the 11 people in each of 22 football teams, or the marks for each of 20 exercises for each of 100 students, or the names (30 characters long) of each of 100 students. We now need two subscripts for each element, and would write
/* 90 * 12 variables */
#define n_yrs 90
#define n_mths 12
float month_rain[ n_yrs ][ n_mths ];
int IQ[ 22 ][ 11 ];
int mark[ 100 ][ 20 ];
char names[ 100 ][ 30 ];
To read rainfall data in, we might use
int year, month; /* for subscripts */
/* Read in all the 12 * 90 values */
for( year = 0; year < n_yrs; year++ ) {
    for( month = 0; month < n_mths; month++ ) {
        /* The elements are "float", so use "%f" */
        scanf( "%f", &month_rain[ year ][ month ] );
    } /* Month loop */
} /* Year loop */
The numbers in the data must be given in the correct order required by the program; if the loops are nested as above, we would require the twelve numbers for the first year, then the twelve numbers for the second year, ...
If the loops had been nested the other way round (just interchange the two "for" lines) the data would have had to consist of the 90 January figures, then the 90 February figures, ...
To calculate the annual totals for each of the 90 years we might write
/* Calculate the year totals */ float total; for( year = 0; year < n_yrs; year++ ) { total = 0; for( month = 0; month < n_mths; month++ ) { total += month_rain[ year ][ month ]; } /* for month */ annual_rain[ year ] = total; } /* for year */
We could have used "annual_rain[ year ]" instead of "total" for our addition above; it would be marginally slower on the computer, since it would have to look up the array subscript each time.
To print the average rainfall over the 90 years for each of the 12 months separately, we might write
/* print monthly averages */ for( month = 0; month < n_mths; month++ ) { total = 0; for( year = 0; year < n_yrs; year++ ) { total += month_rain[ year ] [ month ]; } /* for year */ printf( "%d %f\n", month, total / n_yrs ); } /* for month */
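The two calculations differ only in which subscript is held fixed and which is varied. The sketch below shows this with two small functions; the names "year_total", "month_average", "n_years" and the two-year table "sample_rain" are assumptions made for illustration, and the sample figures are invented, not real rainfall data.

```c
#define n_mths 12

/* Hypothetical sample data for two years */
float sample_rain[ 2 ][ n_mths ] = {
    { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 },
    { 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2 }
};

/* Total for one year: fix the year, vary the month */
float year_total( float month_rain[ ][ n_mths ], int year )
{
    float total = 0;
    int month;

    for ( month = 0; month < n_mths; month++ ) {
        total += month_rain[ year ][ month ];
    } /* for month */
    return total;
}

/* Average for one month: fix the month, vary the year */
float month_average( float month_rain[ ][ n_mths ],
                     int n_years, int month )
{
    float total = 0;
    int year;

    for ( year = 0; year < n_years; year++ ) {
        total += month_rain[ year ][ month ];
    } /* for year */
    return total / n_years;
}
```

When a two-dimensional array is passed to a function, the second bound must appear in the parameter declaration, since the compiler needs it to locate each row.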
To find the first student whose name starts with the letter 'S', we might use
int student = -1, i; for ( i = 0; i < n_students; i++ ) { if ( names[ i ][ 0 ] == 'S' ) { student = i; break; } }
leaving student set to -1 if no such student was found.
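Two-dimensional arrays can be initialised in the same way as one-dimensional arrays (on older compilers only when declared globally; ANSI C allows it for local arrays too). You could write for example
int table[ ][ 5 ] = { { 1, 2, 3, 4, 5 }, { 5, 4, 3, 2, 1 }, { 1, 3, 5, 3, 1 } };
This would give us a 3 by 5 array of integers. Only the first bound may be left empty for the compiler to count; the second bound (5) must always be given.
There is an alternative "flattened" form
int table[3][5] = { 1, 2, 3, 4, 5, 5, 4, 3, 2, 1, 1, 3, 5, 3, 1 };
In the flattened form the second bound is essential, since the compiler could not otherwise know whether we intended a 3 by 5 array or a 5 by 3 array.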
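A small sketch using the initialised table follows. The names "n_rows" and "row_sum" are assumptions for illustration. The second bound is written explicitly, since C allows only the first bound of an array to be left for the compiler to count.

```c
int table[ ][ 5 ] = {
    { 1, 2, 3, 4, 5 },
    { 5, 4, 3, 2, 1 },
    { 1, 3, 5, 3, 1 }
};

/* Number of rows: total size divided by the size of one row */
#define n_rows ( sizeof table / sizeof table[ 0 ] )

/* Add up the five elements of one row of the table */
int row_sum( int row )
{
    int total = 0;
    int col;

    for ( col = 0; col < 5; col++ ) {
        total += table[ row ][ col ];
    } /* for col */
    return total;
}
```

Here "table[ 0 ]" is a whole row of five integers, so dividing by its size gives the number of rows rather than the number of individual elements.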
The particular case of initialised two-dimensional character arrays will be explained more fully later, but is introduced here since you may find it useful. We are concerned with initialising an array of words. When looking at C programs, we may wish to search for all keywords. We would declare
char *keywords[] = { "int", "float", "double", "char", "" };
This is effectively a 2-dimensional array of characters. Each row of this array is of a different length, the length of that keyword + one for the terminating zero which is always added to a string.
We could now have code to look for each keyword in turn; the element "keywords[i][j]" is the j-th character of the i-th keyword. We search along each keyword until we encounter the terminating zero. The empty string "" at the end of the list marks the end of the keywords.
This subject is dealt with more fully later under the subject of pointers.
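As a sketch of how such a list might be searched, the function below checks a word against each keyword in turn, stopping at the empty string which ends the list. The function name "is_keyword" is an assumption for illustration, and the array is declared "const" here since the strings are never altered.

```c
#include <string.h>

/* The list of keywords ends with an empty string */
const char *keywords[ ] = { "int", "float", "double", "char", "" };

/* Return 1 if "word" is one of the keywords, otherwise 0 */
int is_keyword( const char word[ ] )
{
    int i;

    /* keywords[ i ][ 0 ] is the first character of keyword i;
       it is the null character only for the empty string */
    for ( i = 0; keywords[ i ][ 0 ] != '\0'; i++ ) {
        if ( strcmp( word, keywords[ i ] ) == 0 ) {
            return 1;
        }
    } /* for each keyword */
    return 0;
}
```

New keywords can be added to the initialiser without changing the function, provided that the empty string stays at the end.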
You can, of course, declare and use 3-dimensional, 4-dimensional and higher dimensioned arrays by extending the above notation.
In three dimensions
float mass[10][10][10]; int x_co_ord, y_co_ord, z_co_ord; for ( x_co_ord = 0; x_co_ord < 10; x_co_ord++ ) { for ( y_co_ord = 0; y_co_ord < 10; y_co_ord++ ) { for ( z_co_ord = 0; z_co_ord < 10; z_co_ord++ ) { if ( mass[x_co_ord][y_co_ord][z_co_ord] >= min ) ... } /* z loop */ } /* y loop */ } /* x loop */
Beware that it is very easy to eat up huge amounts of storage (the above example declares 1000 variables) with many-dimensioned arrays.
We read text from standard input until end-of-file is encountered, printing each line as it is read in, reflected from left to right. We use the function "getchar()", which reads spaces and newline characters as such. The "getchar()" function returns the negative value EOF when it reaches end-of-file, so the loop is controlled by a "while ... != EOF" mechanism. The result must be stored in an "int" rather than a "char", so that EOF can be distinguished from every real character.
#include <stdio.h>

int main( void )
{
    /* Store each line as it is read */
    char line[100];
    int ch; /* "int", so that it can hold EOF */
    int i = 0;

    while ( ( ch = getchar() ) != EOF ) {
        if ( ch == '\n' ) {
            /* End of line, print in reverse */
            while( --i >= 0 ) {
                printf( "%c", line[ i ] );
            }
            printf( "\n" );
            i = 0;
        } else if ( i < 100 ) {
            /* Ordinary character, store it if there is room; */
            /* characters beyond the 100th are discarded */
            line[ i++ ] = ch;
        } /* end if else end of line */
    } /* end while not end of file */
    return 0;
} /* end main */
Observe that we have a single loop to read the characters, which takes a special action when it encounters a newline character.
Observe that, if the compiled program is called "reflect", then typing the command
cat any_file | reflect | reflect
or
reflect < any_file | reflect
should show the original file.
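The property that reflecting twice restores the original can be checked on a single string with the sketch below. The function name "reflect" and its two-array interface are assumptions for illustration; the caller must supply an output array at least one character longer than the input string.

```c
#include <string.h>

/* Copy the null-terminated string "in" into "out" with the
   characters in reverse order.  "out" must have room for
   strlen( in ) + 1 characters. */
void reflect( const char in[ ], char out[ ] )
{
    int length = (int) strlen( in );
    int i;

    for ( i = 0; i < length; i++ ) {
        out[ i ] = in[ length - 1 - i ];
    } /* for i */
    out[ length ] = '\0';   /* terminate the result */
}
```

Calling the function on its own output gives back the original string, which is the single-line equivalent of piping the file through the program twice.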
The purpose of structures is to group together a number of related items. The items do NOT need to be of the same type.
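We are analysing the properties of a proposed bridge design. For each component of the structure we need its strength, dimensions, cost, weight, name and stock level.
We are working with geographical data. Each item of data consists of an (x,y) co-ordinate and a value.
We are working with train timetables. For each entry in the timetable we need the departure and arrival times, and perhaps whether the train runs on Fridays only and whether it has a buffet.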
We will declare first the layout (components) required of a particular type of structure; the declaration of the structure objects themselves will come later elsewhere.
A bridge component structure type might need
struct element { float strength; float length, breadth, height; int cost; float weight; char name[50]; int stock; };
Note that this merely defines a structure type. It does not define an object of any sort.
A general purpose structure for handling dates might be
struct date { int day_no, week_no, month_no; char name[4]; long int secs; };
A structure for integer data values each of which is associated with an (x,y) co-ordinate (the x and y could represent geographical latitude and longitude or Ordnance survey co-ordinates) might be
/* The structure type */ struct data_item { float x, y; int value; };
A structure for railway timetable entries might be
struct t_table { int depart; int arrive; int Fri_only; int buffet; };
The separate components of the structure are called its fields.
The above declarations are of structure types, not of structure variables for storing data values. To actually declare objects of the above structures, we might write
struct data_item this, that, result[200];
Here we have declared two single structures called this and that and an array of 200 structures; the whole array is called result .
Do not confuse the structure type declaration (uses no space, gives an identifier to the type of structure) and the variable declarations (they occupy space in the program's memory when the program is running). Typically the structure type declarations would go into a header file.
In order to access individual fields of a structure we use the structure variable identifier, followed by a full stop, then the field identifier from the structure type declaration.
/* "girder" is an "element" structure */
struct element girder;
/* "all_stock" is an array of 100 structures */
struct element all_stock[ 100 ];

/* A character array cannot be assigned with "=", so the */
/* name is copied in with "strcpy" (from <string.h>) */
strcpy( girder.name, "Box girder type 10A" );
girder.strength = 25.93;
girder.stock = 0;

/* This assumes that "struct element" also has */
/* an "int required;" field */
if ( all_stock[ 10 ].stock < minimum ) {
    all_stock[ 10 ].required = minimum;
}
tot_requd = 0;
for ( stock = 0; stock < 100; stock++ ) {
    tot_requd += all_stock[ stock ].required;
} /* for stock loop */
The compiler will not be confused by the fact that "stock" is used both as a structure field identifier, and as an ordinary variable identifier (here the loop subscript). The field identifier will always follow a full stop.
/* "nott_london" is an array of "t_table" */ /* for the Nottm to London timetable */ for( i = 0; i < NTTS; i++ ) { if ( nott_london[ i ].depart < .... ) { ....;
/* "result" is an array of "data_item"s */ /* add data values for all points within */ /* given "radius" circle */ int sum = 0; for ( i = 0; ...... ) { if ( result[ i ].x * result[ i ].x + result[ i ].y * result[ i ].y < radius * radius ) { sum += result[ i ].value; } } /* for i loop */
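A complete version of this calculation might look like the sketch below. The function name "sum_within", its parameters and the three-element array "value" (with the same figures as are used for initialisation later in these notes) are assumptions for illustration; the circle is centred on the origin, as in the fragment above.

```c
/* The structure type */
struct data_item {
    float x, y;
    int value;
};

/* Three sample points for illustration */
struct data_item value[ ] = {
    { 1.0, 2.0, 1323 },
    { 1.5, 1.0, 4523 },
    { 0.0, 2.9, 4373 }
};

/* Add the data values of those of the first "n" items which
   lie strictly inside a circle of the given radius centred
   on the origin. */
int sum_within( const struct data_item items[ ], int n, float radius )
{
    int sum = 0;
    int i;

    for ( i = 0; i < n; i++ ) {
        if ( items[ i ].x * items[ i ].x
           + items[ i ].y * items[ i ].y < radius * radius ) {
            sum += items[ i ].value;
        }
    } /* for i loop */
    return sum;
}
```

Comparing the squared distance with the squared radius avoids the need for a square root.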
To copy a whole structure use:
all_stock[0] = girder;
or if you require merely to copy certain fields use:
all_stock[0].strength = girder.strength; all_stock[0].length = girder.length; all_stock[0].breadth = girder.breadth; all_stock[0].height = girder.height;
To initialise structures use a similar format to that for arrays with values between curly braces, as in
struct element girder = { 1234.56, 15.7, 32.6, 99.9, 7600, 50.05, "Box girder" };
For an array of structures, use either the nested braces as in
struct data_item value[] = { { 1.0, 2.0, 1323 }, { 1.5, 1.0, 4523 }, { 0.0, 2.9, 4373 }, };
or use the flattened form as in
struct data_item value[] = { 1.0, 2.0, 1323, 1.5, 1.0, 4523, 0.0, 2.9, 4373, };
The length of (the number of elements in) an initialised array of structures can be calculated (a compile-time constant) by using the sizeof operator to divide the total size of the array by the size of one element, as in
#define nvalues ( sizeof value / sizeof value[0] )
Do not forget always to divide by the size of a single element of the array.
For a simplified BritRail timetable, use perhaps
struct t_table { int depart, arrive ; };
for the structure type, and
struct t_table nott_lond[] = { { 537, 727 }, { 738, 909 }, { 837, 1023 }, { 1037, 1216 }, { 1237, 1412 }, { 1437, 1623 }, { 1637, 1819 }, { 2037, 2229 }, { 2214, 512 } };
for the initialised declaration. You will also need
#define num_trains ( sizeof nott_lond / sizeof nott_lond[0] )
In this case the structure type "t_table" represents a single timetable entry, whereas the array of structures "nott_lond" represents a whole timetable.
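As a sketch of how the timetable might be used, the function below finds the first train leaving at or after a given time. The function name "first_train_after" is an assumption for illustration; times are held as integers such as 837 for 08:37, as in the table above, and the entries are assumed to be in order of departure.

```c
struct t_table {
    int depart, arrive;
};

struct t_table nott_lond[ ] = {
    { 537, 727 }, { 738, 909 }, { 837, 1023 },
    { 1037, 1216 }, { 1237, 1412 }, { 1437, 1623 },
    { 1637, 1819 }, { 2037, 2229 }, { 2214, 512 }
};

#define num_trains ( sizeof nott_lond / sizeof nott_lond[ 0 ] )

/* Return the subscript of the first train departing at or
   after "time", or -1 if there is no such train. */
int first_train_after( int time )
{
    int i;

    for ( i = 0; i < (int) num_trains; i++ ) {
        if ( nott_lond[ i ].depart >= time ) {
            return i;
        }
    } /* for each timetable entry */
    return -1;
}
```

The arrival time of the train found is then "nott_lond[ i ].arrive", where i is the value returned, provided that value is not -1.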
Ordinary mortals do not need to know that structure notation can be used to break a single computer word down into small bit-fields for storing small data items.
struct line_type { int i; float f; unsigned line : 5; unsigned col : 6; unsigned mode : 2; unsigned t : 2; };
The integer after the colon in each bit-field is the number of bits allocated. Bit-fields must be of an integer type, and are best declared unsigned.
Whenever you have related items, they should be grouped in a structure. If a program contains two arrays of the same length, it can usually be inferred that the corresponding elements are related.
You should replace
float x[100], y[100]; int value[100]; for ( i = 0; i < 100; i++ ) printf( "%f %f\n", x[i], y[i] );
by
struct data { float x, y; int value; }; struct data point[100];
and use by
for ( i = 0; i < 100; i++ ) printf( "%f %f\n", point[i].x, point[i].y );
Program style checkers look for arrays of the same length, and recommend that they be turned into a structure.
You could try
#define MAX 100 #define ALLI i = 0; i < MAX; i++ #define EVER ;; for( ALLI ) { a[ i ] = ... } for( EVER ) { ... if ( ... ) break; ... }
These can make code more readable.
Typedef is like a "#define" for data types. To define a type "addr" which is equivalent to "int" on this machine use
typedef int addr;
and then declare
addr fred, jim;
On another machine, it could be typedef'd to another type.
struct disk { ... };
typedef struct disk DISK;
DISK a, b[10];
Copyright Eric Foxley 1996
Notes converted from troff to HTML by an Eric Foxley shell script, email errors to me!