This chapter shows how to use pointers to overcome the fixed size limitations inherent in data types such as arrays. The chapter is based around two example C++ classes. The first, Intarr, is a dynamically expandable array class which implements bounds checking on subscripts. The second is a much more complex String class for manipulating text. Its functionality includes comparison of strings, concatenation, substring matching and input and output.
In order to build these classes some new C++ concepts are introduced. These include:
These techniques turn out to be generally applicable to a range of new C++ classes.
The goal of the Intarr class is to provide a resizable array of integers which supports bounds checking.
First, it is necessary to remember the general properties of the in-built C++ array data type. An array is an ordered collection of elements all of which are of the same base type. Elements are accessed by their position in the array through a subscript using the [] operator. There are two major drawbacks to C++ arrays.
The new Intarr class will provide an array of integers with all of the usual functionality of arrays and with the above drawbacks removed. The interface to Intarr supports the following functions:
Constructor: set up an array with an initial specified length and all elements initialised to zero. Access: return a specified element from the array. return the current size of the array reset the size of the array (either increase or decrease) I/O: print out the elements in the array
This leads to the following class definition.
// C++ class definition for a dynamically resizable array of // integers with bounds checking class Intarr { private: int *a, len; public: //constructor Intarr(int length); // destructor ~Intarr(); // return nth element int& operator[](int n); // return size int size(); // reset the size void resize(int length); // print void print(); };
Notice the inclusion of the method ~Intarr. This is called the destructor for the class and is automatically called when an object of the class is destroyed (i.e. "goes out of scope"). This might occur at the end of the program, at the end of a function (for a local object) or when the user explicitly deletes an object using the delete operator. Any class may have a destructor and the destructor may take any action the user deems necessary or useful. One common use of destructors is to print out a message saying that the object has been destroyed in order to aid the debugging process. For classes which allocate their own memory for data members, a destructor is definitely required. The name of the destructor is always the name of the class with the ~ character in front of it.
This example shows that it is also possible to overload the array subscript operator []. This allows users of the class to write code such as the following:
Intarr a(6); a[1] = 2; cout << a[1];
The argument to this operator is the array subscript (an integer number). The result is an element of the array. This result MUST be returned using the reference mechanism! This is because we wish to include elements on both the right and left hand sides of assignment statements. Appearing on the right hand side means that we wish to change the original element (e.g. a[1] = 2). In other words we want the function to deliberately cause a side effect on the Intarr object and so we must have a reference result type.
Our implementation of this class requires two data methods - an array of integers to store its elements and a count of how big the array currently is. The latter is to be used in bounds checking and resizing. The array is declared as an integer pointer (remember that pointers and arrays are inter-changeable). Thus we have:
int *a, len;
The constructor function needs to allocate enough memory to store the requested number of elements and set the pointer a to point at this memory. It also needs to record the initial length of the array and to initialise all the elements to the value zero. This is achieved by the following code:
Intarr::Intarr(int length) { // allocate memory for the array a = new int[length]; // note the length len = length; // initialise all elements to 0 int i; for(i = 0; i < len; i++) a[i] = 0; }
Notice that we can use the operator new to allocate memory for an array of objects in one go by telling it how many elements there are between square brackets.
The destructor is responsible for freeing the memory allocated by the class whenever an Intarr object is destroyed. It is necessary to do this because we explicitly allocated the memory ourselves in the first place.
// destructor frees the memory Intarr::~Intarr() { // free memory for len elements delete [len]a; }
Notice that in order to free memory for an array of objects, we have to tell delete how many objects there are.
Operator[] first checks that the given subscript is in bounds and, if it is, simply returns the specified element of the array a.
// return reference to nth element int& Intarr::operator[](int n) { // check bounds (remember arrays start at 0) if( 0 <= n && n < len) // if in bounds return reference return a[n]; else { // indicate error and stop the program cout << "Error in Intarray operator[]: subscript " << n << " is out of bounds\\n"; exit(-1); } }
The size() function merely returns the value of the data member len.
// return size of array int Intarr::size() { return len; }
The resize function is a little trickier. Its first action is to set up a temporary new array with the correct new length. Next it copies across as many elements of the old array as it can into the new one. If the new array is longer, it sets the additional elements to zero. If it is shorter, it truncates the old array. Then it frees the memory for the old array using delete. Finally, is sets the data member a to point at the new array and resets len to the new length.
This technique of setting up new memory and copying values across will also be used in several later classes.
// resize array to 'length' elements void Intarr::resize(int length) { // first check that the new length is not the same as the old if ( length == len ) return; // allocate new memory and new length int *newa = new int[length]; int newlen = length; // note the shorter of the new and old lengths int min; if (newlen < len) min = newlen; else min = len; // copy across old elements // (truncate if new length is shorter) for(int i = 0; i < min; i++) newa[i] = a[i]; // set any remaining elements to 0 for(i = min; i < newlen; i++) newa[i] = 0; // free old memory delete [len]a; // set new memory and new length a = newa; len = newlen; }
Finally, the print method prints out the array in the form a,b,c ...z.
// print out array void Intarr::print() { int i; for (i = 0; i < len; i++) cout << a[i] << ","; cout << "\\n"; }
The following simple program demonstrates the syntax for using the Intarr class. Notice how the overloaded [] operator makes the class appear much more like in-built arrays. You might like to think about other possible overloaded operators for this class (e.g. overloading >> and <<).
// test the Intarr class #include <stream.h> #include "Intarr.h" main() { Intarr a(3); a[0] = 0; a[1] = 1; a[2] = 2; a.print(); cout << "size of a is" << a.size() << "\\n"; a.resize(5); a.print(); a[3] = 3; a[4] = 4; // this next one would cause an error // a[5] = 5; a.print(); a.resize(2); a.print(); }
The following briefly summarises the key new points introduced through the Intarr class:
Manipulating text is an important part of many programs. Text might come in the form of words, sentences, names, paragraphs or entire documents. C++ provides the basic char* type for manipulating text. However, this is extremely limited. In effect, we can only:
There are many additional facilities that might be included in a general purpose string class.
A string class that supported such functionality would be extremely useful for many applications. Our goal is to design and implement such a class. There are many additional functions that might be useful that we won't consider in our class. These include case conversion, encryption and reversing strings.
My analysis of the String class led to the following interface design.
Constructors: build an empty string; a blank string of a given length; a string from an existing char*; a string from another string. Combination: concatenate two strings to form a new one (overload operator +). Tests: Test for empty string (!) as well as compare strings ( <, >, <=, >=, ==, !=). Also does one string contain another? Access: Return the length of the string. Access individual letters (overloaded []). Input/output: Overload >> and << to read and write strings
The analysis above leads to the following class definition.
// definition of a character string class // Steve Benford Nov 1990, Revised Nov 1991 #include <stream.h> class String { private: char* str; // pointer to character array int len; // length of the string; public: // constructors String(char *s); // allocate and initialise String(int i); // allocate length and set to blanks String(); // allocate to zero length String(const String& s); // copy constructor // destructor ~String(); // destructor; // combination void operator=(String& s); // assignment between two strings String operator+(String& s); // concatenation // tests int operator!(); // test whether 0 int operator==(String& s); // test for equality int operator!=(String& s); // test for inequality int operator<(String& s); // test alphabetically less than int operator>(String& s); // test alphabetically greater than int operator<=(String& s); // test alphabetically less than/equals int operator>=(String& s); // test alphabetically greater/equals int contains(String& s); // does one string contain another? // access char& operator[](int n); // return char in position n int length(); // return length of the string // I/O // overloaded output operator friend ostream& operator<<(ostream& os, String &s); // overloaded input operator // read a sequence of characters ending in a newline into parameter s // s will be set to NULL if the line length is greater than // a specific length (BUF_LEN = 200) friend istream& operator>>(istream& is, String &s); };
The class provides four constructors including the copy constructor String(const String& s) which builds one string from another. Note the word "const" in front of the argument declaration. Remember that we pass the String argument by reference for reasons of efficiency, but that this can cause side-effects (i.e. our constructor could change the string object that is its argument). The use of const prohibits these side-effects. In general, you can use const arguments when using references for increased efficiency both where you want to stop possible side-effects.
The String class provides a destructor and also makes extensive use of operator overloading, including overloaded << and >> operators. This also includes overloading the assignment operator, void operator=(String& s), something that we haven't seen before. This will be fully explained below.
Internally, the String class resembles the Intarr class in that there are two data members - a char* pointer to store the string and an integer to record its current length.
The first step of the implementation is to define two generally useful functions strlength and strcopy which calculate the length of a char* and copy one char* to another respectively. In fact, there are standard versions of these functions available in the string library called strcpy and strlen. However, it is worth understanding them for yourselves. It is important to note that strcopy assumes that the right amount memory has already been allocated for the new string before the function is called. If it hasn't, the function will fail and will probably cause the program to crash.
// generally useful functions // calculate length of char* int strlength(char* c) { int i = 0; // set up a new pointer to the front of the string char *p = c; // keep going till you reach the end of the string (i.e. null) while(*p++ != '\\0') i++; return i; } // copy one char* to another char* (assume space allocated) void strcopy(char *s, char *t) { // while not at end of string while(*t != '\\0') { // copy the current characters *s = *t; // move the pointers one notch along the strings s++; t++; } // put the null character at the end *s = '\\0'; }
The first two constructors set up an empty string and a string of n blanks (space characters).
// construct an empty String String::String() { // length is currently zero len = 0; // set up array with end of string in it str = new char[1]; str[0] = '\\0'; } // construct a String initialised to n blanks String::String(int n) { len = n; // allocate enough space for the string (+ null) str = new char[len+1]; // copy blanks into the memory for(int i=0; i<n; i++) str[i] = ' '; // set end of string marker str[n] = '\\0'; }
Constructing a string from a char* involves recording the length of the string, allocating memory for the internal str pointer and then copying the char* pointed at by the argument s to this memory. Note that we must allocate anough memory for the null character ('\\0') as well (i.e. len+1 chars).
// construct a String from a char* String::String(char *s) { len = strlength(s); // allocate enough space for the string (+ null) str = new char[len+1]; // copy the string into this memory strcopy(str,s); }
The copy constructor is similar, except that we copy the internal char* given by s.str.
// copy constructor' one string from another String::String(const String& s) { // set the length len = s.len; // allocate space str = new char[len + 1]; // fill it up strcopy(str, s.str); }
The destructor is responsible for freeing the memory allocated to the string.
// destructor String::~String() { // delete the memory associated with the string delete str; }
Concatenation involves setting up a new temporary string whose length is the combined length of the strings to be joined. The contents of both strings are then copied into this new memory.
// concatenation String String::operator+(String& s) { // initialise result string of correct length String res(len + s.len); // copy in first part strcopy(res.str, str); // copy in second part for (int i = 0; i< s.len; i++) res.str[len + i] = s.str[i]; // set end of string marker res.str[res.len] = '\\0'; // return result return res; }
The following code implements the various test methods in the string class. Notice that when testing equality we first compare the lengths of the strings before each character in turn.
// test for empty string int String::operator!() { return !len; } // test for inequality int String::operator!=(String& s) { // first, are strings the same length? if(len != s.len) return 1; // if they are, test individual characters for(int i=0; i < len; i++) if (str[i] != s.str[i]) return 1; // if we get here they are the same return 0; } // test for equality int String::operator==(String& s) { // are strings the same length? if(len != s.len) return 0; // test individual characters for(int i=0; i < len; i++) if (str[i] != s.str[i]) return 0; // if we get here they are the same return 1; } // test whether alphabetically less than int String::operator<(String& s) { // set two pointers to the start of each string char *p = str, *q = s.str; // compare each character in turn while not at end of either string while((*p != '\\0') && (*q != '\\0')) { if (*p < *q) return 1; if (*p > *q) return 0; // move pointers along the string p++; q++; } // is q longer than p if ((*p == '\\0') && (*q != '\\0')) return 1; else return 0; } // test alphabetically greater than int String::operator>(String& s) { // set two pointers to the start of each string char *p = str, *q = s.str; // compare each character in turn while not at end of either string while((*p != '\\0') && (*q != '\\0')) { if (*p > *q) return 1; if (*p < *q) return 0; // move pointers along the string p++; q++; } // is p longer than q? if ((*p != '\\0') && ( *q == '\\0') ) return 1; else return 0; } // test alphabetically less than/equals int String::operator<=(String& s) { // set two pointers to the start of each string char *p = str, *q = s.str; // compare each character in turn while not at end of either string while((*p != '\\0') && (*q != '\\0')) { if (*p < *q) return 1; if (*p > *q) return 0; // move pointers along the string p++; q++; } // we are at the end of string p? if ((*p == '\\0')) return 1; else return 0; } // test alphabetically greater/equals int String::operator>=(String& s) { // set two pointers to the start of each string char *p = str, *q = s.str; // compare each character in turn while not at end of either string while((*p != '\\0') && (*q != '\\0')) { if (*p > *q) return 1; if (*p < *q) return 0; // move pointers along the string p++; q++; } // are we at the end of string q? if ((*q == '\\0')) return 1; else return 0; } // does one string contain another int String::contains(String& s) { int i,j,match; // test lengths first to see if contained is longer than container if(s.len > len) return 0; // go down string looking to match first character // note: we don't need to go to the end of the array for(i = 0; i < len - s.len; i++) // do characters match? if(str[i] == s.str[0]) { // if so ASSUME a positive match for the whole word match = 1; // now check a character at a time to see // if the assumption was true for(j = 0; j < s.len; j++) // oops, it wasn't if(str[i+j] != s.str[j]) match = 0; // do we still have a positive match? if(match) return 1; } // if we got here there was no match return 0; }
Like in the Intarr class, the overloaded [] operator allows access to individual elements of an array object.
// index an element of the String using the array [] operator char& String::operator[](int n) { // check bounds if(n > len) { cout << "String error: reference out of bounds\\n"; exit(-1); } else return str[n]; } // return the length of the string int String::length() { return len; }
Here are the overloaded input and output operators.
// overloaded output operator (friend function) ostream& operator<<(ostream& os, String &s) { if(s.len > 0) os << s.str; return os; } // overloaded input operator (friend function) // the strategy is to read into an array one character at a time // (watching for overflow of the array bounds) // and then to use the array to initialise a string object istream& operator>>(istream& is, String &s) { // temporary array to hold input const int BUF_LEN = 200; char buffer[BUF_LEN]; // read a character at a time until end of line or buffer overflow int i = 0; // keep going till overflow while(i < BUF_LEN) { // read next character (use get function) is.get(buffer[i]); // check for end of input (i.e. newline) if(buffer[i] == '\\n') { // put in end of string character buffer[i] = '\\0'; // initialise the string object s s = String(buffer); // return return is; } // increment i i++; } // the buffer has overflowed if we get here // so set s to be the empty String String temp; s = temp; return is; }
In this section we discuss overloading the assignment operator ( = ). First we find out the motivation for doing this. Then we look at how it is achieved.
C++ allows us to assign objects of user-defined classes directly to each other. Thus, the following is valid.
Point a(1,0), b(2,1); b = a;
When the assignment takes place, the computer simply copies the values of the data members of b into those of a. In other words,
b.X = a.X; b.Y = a.Y
This simple approach is called memberwise copying and is the default technique for copying objects. It works fine for simple classes with automatically allocated memory (e.g. Point, Counter and Money). However, it fails for classes which allocate their own memory due to the effect of copying pointers. Consider the following.
String a("Hi"), b("Yo"); b = a;
This has the effect of:
b.len = a.len; b.str = a.str;
This produces the following effect:
There are now two major problems:
The solution to this problem is that we need to replace the memberwise copying assignment mechanism with something more sophisticated. This is achieved by overloading the assignment operator (operator=). This operator is called when ever two objects are copied - THIS INCLUDES PASSING FUNCTION ARGUMENTS AND RESULTS BY VALUE!
We want our version to do the following:
This is achieved by the following code.
// assignment void String::operator=(String& s) { // first delete the old memory delete str; // copy across new string len = s.len; str = new char[len + 1]; strcopy(str, s.str); }
It is crucial to note that this operator will be called for argument and result passing in functions as well as for direct assignment.
It will be necessary to overload the assignment operator for any class which allocates and frees its own memory.
In summary, the String class has had to consider memory allocation and deletion in the following places:
It is possible to think of many interesting extensions to the String class:
Notes converted from troff to HTML by an Eric Foxley shell script, email errors to me!