Chapter 6: Expandable ADTs

Chapter 6: Expandable ADTs

Contents

6.1. Introduction

This chapter shows how to use pointers to overcome the fixed size limitations inherent in data types such as arrays. The chapter is based around two example C++ classes. The first, Intarr, is a dynamically expandable array class which implements bounds checking on subscripts. The second is a much more complex String class for manipulating text. Its functionality includes comparison of strings, concatenation, substring matching and input and output.

In order to build these classes some new C++ concepts are introduced. These include:

These techniques turn out to be generally applicable to a range of new C++ classes.

6.2. Intarr - a dynamic array

The goal of the Intarr class is to provide a resizable array of integers which supports bounds checking.

First, it is necessary to remember the general properties of the in-built C++ array data type. An array is an ordered collection of elements all of which are of the same base type. Elements are accessed by their position in the array through a subscript using the [] operator. There are two major drawbacks to C++ arrays.

  • Once defined, they have a fixed size. This makes them unsuitable for storing highly dynamic kinds of information such as general text (the length of a sentence can vary greatly).
  • The [] operator doesn't check whether the given subscript falls within the array bounds. This makes arrays dangerous to use.

    6.2.1. Definition of the Intarr class

    The new Intarr class will provide an array of integers with all of the usual functionality of arrays and with the above drawbacks removed. The interface to Intarr supports the following functions:

    This leads to the following class definition.

    Notice the inclusion of the method ~Intarr. This is called the destructor for the class and is automatically called when an object of the class is destroyed (i.e. "goes out of scope"). This might occur at the end of the program, at the end of a function (for a local object) or when the user explicitly deletes an object using the delete operator. Any class may have a destructor and the destructor may take any action the user deems necessary or useful. One common use of destructors is to print out a message saying that the object has been destroyed in order to aid the debugging process. For classes which allocate their own memory for data members, a destructor is definitely required. The name of the destructor is always the name of the class with the ~ character in front of it.

    This example shows that it is also possible to overload the array subscript operator []. This allows users of the class to write code such as the following:

    The argument to this operator is the array subscript (an integer number). The result is an element of the array. This result MUST be returned using the reference mechanism! This is because we wish to include elements on both the right and left hand sides of assignment statements. Appearing on the right hand side means that we wish to change the original element (e.g. a[1] = 2). In other words we want the function to deliberately cause a side effect on the Intarr object and so we must have a reference result type.

    6.2.2. Implementation of the Intarr class

    Our implementation of this class requires two data methods - an array of integers to store its elements and a count of how big the array currently is. The latter is to be used in bounds checking and resizing. The array is declared as an integer pointer (remember that pointers and arrays are inter-changeable). Thus we have:

    The constructor function needs to allocate enough memory to store the requested number of elements and set the pointer a to point at this memory. It also needs to record the initial length of the array and to initialise all the elements to the value zero. This is achieved by the following code:

    Notice that we can use the operator new to allocate memory for an array of objects in one go by telling it how many elements there are between square brackets.

    The destructor is responsible for freeing the memory allocated by the class whenever an Intarr object is destroyed. It is necessary to do this because we explicitly allocated the memory ourselves in the first place.

    Notice that in order to free memory for an array of objects, we have to tell delete how many objects there are.

    Operator[] first checks that the given subscript is in bounds and, if it is, simply returns the specified element of the array a.

    The size() function merely returns the value of the data member len.

    The resize function is a little trickier. Its first action is to set up a temporary new array with the correct new length. Next it copies across as many elements of the old array as it can into the new one. If the new array is longer, it sets the additional elements to zero. If it is shorter, it truncates the old array. Then it frees the memory for the old array using delete. Finally, is sets the data member a to point at the new array and resets len to the new length.

    This technique of setting up new memory and copying values across will also be used in several later classes.

    Finally, the print method prints out the array in the form a,b,c ...z.

    6.2.3. Testing the Intarr class

    The following simple program demonstrates the syntax for using the Intarr class. Notice how the overloaded [] operator makes the class appear much more like in-built arrays. You might like to think about other possible overloaded operators for this class (e.g. overloading >> and <<).

    The following briefly summarises the key new points introduced through the Intarr class:

  • The constructor needs to allocate memory using the new operator.
  • The destructor needs to free this memory using the delete operator.
  • Both new and delete can deal with arrays of objects provided they are told how big the arrays are.
  • The overloaded [] operator must return a reference result so that it can be used on the left hand as well as the right hand side of assignment statements.
  • The resize method works by setting up new memory, copying across data from old memory and then deleting the old memory.

    6.3. The String class

    Manipulating text is an important part of many programs. Text might come in the form of words, sentences, names, paragraphs or entire documents. C++ provides the basic char* type for manipulating text. However, this is extremely limited. In effect, we can only:

  • Point to a string (e.g. char *s = "Hello").
  • Index individual letters of a string (e.g. if ( s[0] == 'H') ....
  • Display a string (e.g. cout << s).

    There are many additional facilities that might be included in a general purpose string class.

  • The ability to copy strings to each other.
  • The ability to concatenate (i.e. join-together) strings to form new strings.
  • The ability to compare strings based on dictionary ordering (e.g. <, >, == etc).
  • The ability to read strings using the >> operator and print them using <<.

    A string class that supported such functionality would be extremely useful for many applications. Our goal is to design and implement such a class. There are many additional functions that might be useful that we won't consider in our class. These include case conversion, encryption and reversing strings.

    6.3.1. Definition of the String class

    My analysis of the String class led to the following interface design.

    The analysis above leads to the following class definition.

    The class provides four constructors including the copy constructor String(const String& s) which builds one string from another. Note the word "const" in front of the argument declaration. Remember that we pass the String argument by reference for reasons of efficiency, but that this can cause side-effects (i.e. our constructor could change the string object that is its argument). The use of const prohibits these side-effects. In general, you can use const arguments when using references for increased efficiency both where you want to stop possible side-effects.

    The String class provides a destructor and also makes extensive use of operator overloading, including overloaded << and >> operators. This also includes overloading the assignment operator, void operator=(String& s), something that we haven't seen before. This will be fully explained below.

    6.3.2. Implementation of the String class

    Internally, the String class resembles the Intarr class in that there are two data members - a char* pointer to store the string and an integer to record its current length.

    The first step of the implementation is to define two generally useful functions strlength and strcopy which calculate the length of a char* and copy one char* to another respectively. In fact, there are standard versions of these functions available in the string library called strcpy and strlen. However, it is worth understanding them for yourselves. It is important to note that strcopy assumes that the right amount memory has already been allocated for the new string before the function is called. If it hasn't, the function will fail and will probably cause the program to crash.

    The first two constructors set up an empty string and a string of n blanks (space characters).

    Constructing a string from a char* involves recording the length of the string, allocating memory for the internal str pointer and then copying the char* pointed at by the argument s to this memory. Note that we must allocate anough memory for the null character ('\\0') as well (i.e. len+1 chars).

    The copy constructor is similar, except that we copy the internal char* given by s.str.

    The destructor is responsible for freeing the memory allocated to the string.

    Concatenation involves setting up a new temporary string whose length is the combined length of the strings to be joined. The contents of both strings are then copied into this new memory.

    The following code implements the various test methods in the string class. Notice that when testing equality we first compare the lengths of the strings before each character in turn.

    Like in the Intarr class, the overloaded [] operator allows access to individual elements of an array object.

    Here are the overloaded input and output operators.

    6.4. Overloading the assignment operator

    In this section we discuss overloading the assignment operator ( = ). First we find out the motivation for doing this. Then we look at how it is achieved.

    C++ allows us to assign objects of user-defined classes directly to each other. Thus, the following is valid.

    When the assignment takes place, the computer simply copies the values of the data members of b into those of a. In other words,

    This simple approach is called memberwise copying and is the default technique for copying objects. It works fine for simple classes with automatically allocated memory (e.g. Point, Counter and Money). However, it fails for classes which allocate their own memory due to the effect of copying pointers. Consider the following.

    This has the effect of:

    This produces the following effect:

    There are now two major problems:

  • a.str points at the wrong place. What happens if b changes or is deleted?
  • The memory containing "Yo" is now hanging.

    The solution to this problem is that we need to replace the memberwise copying assignment mechanism with something more sophisticated. This is achieved by overloading the assignment operator (operator=). This operator is called when ever two objects are copied - THIS INCLUDES PASSING FUNCTION ARGUMENTS AND RESULTS BY VALUE!

    We want our version to do the following:

  • Free the old memory pointed at by b.str.
  • Allocate sufficient new memory for b.str.
  • Copy across the contents of a.str into b.str.

    This is achieved by the following code.

    It is crucial to note that this operator will be called for argument and result passing in functions as well as for direct assignment.

    It will be necessary to overload the assignment operator for any class which allocates and frees its own memory.

    In summary, the String class has had to consider memory allocation and deletion in the following places:

  • Constructors
  • Destructors
  • Concatenation
  • Assignment

    It is possible to think of many interesting extensions to the String class:

  • Reformating - case conversion, encryption, reversal, compression
  • Matching - wildcards, case-ignore
  • Combination - splitting, selecting parts


    Notes converted from troff to HTML by an Eric Foxley shell script, email errors to me!