4 Classes and memory allocation

We're always interested in getting feedback. E-mail us if you like this guide, if you think that important material is omitted, if you encounter errors in the code examples or in the documentation, if you find any typos, or generally just if you feel like e-mailing. Mail to Karel Kubat (karel@icce.rug.nl). Please state the concerned document version, found in the title. If you're interested in a printable PostScript copy, pick up your own copy via ftp at ftp.icce.rug.nl/pub/http.

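In contrast to the set of functions which handle memory allocation in C (i.e., malloc() etc.), the operators new and delete are specifically meant to be used with the features that C++ offers. The important difference between malloc() and new is that malloc() only returns a block of raw memory and doesn't `know' what that memory will be used for, whereas new is given a type and makes sure that, when an object is allocated, a corresponding constructor is called.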

The comparison between free() and delete is analogous: delete makes sure that when an object is deallocated, a corresponding destructor is called.
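The difference can be made visible with a small sketch. The class Tracer, its two counters and the functions with_malloc() and with_new() are hypothetical names, introduced only for this illustration: the class counts how often its constructor and destructor are called.

```cpp
#include <cstdlib>

// Hypothetical helper class: counts constructor and destructor calls.
struct Tracer
{
    static int constructed;
    static int destroyed;

    Tracer ()  { ++constructed; }
    ~Tracer () { ++destroyed; }
};

int Tracer::constructed = 0;
int Tracer::destroyed = 0;

// Allocates and releases room for one Tracer with malloc()/free():
// only raw memory is handled, no constructor or destructor runs.
void with_malloc ()
{
    void *raw = std::malloc (sizeof (Tracer));
    std::free (raw);
}

// Allocates and releases one Tracer with new/delete:
// the constructor and the destructor are each called once.
void with_new ()
{
    Tracer *t = new Tracer;
    delete t;
}
```

After a call of with_malloc() both counters are still zero; after a call of with_new() both have been incremented by one.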

The calling of constructors and destructors when objects are created or destroyed has a number of consequences, which are discussed in this chapter. Many problems in program development in C are caused by incorrect memory allocation or memory leaks: memory is not allocated, not freed, not initialized, boundaries are overwritten, etc. C++ does not `magically' solve these problems, but it does provide a number of handy tools. This chapter discusses classes with pointer data members, the assignment operator, the this pointer, the copy constructor, and some further examples of operator overloading.

4.1 Classes with pointer data members

In this section we shall again use the class Person as example:

    class Person
    {
        public:
            // constructors and destructor
            Person ();
            Person (char const *n, char const *a,
                    char const *p);
            ~Person ();

            // interface functions
            void setname (char const *n);
            void setaddress (char const *a);
            void setphone (char const *p);

            char const *getname (void) const;
            char const *getaddress (void) const;
            char const *getphone (void) const;

        private:
            // data fields
            char *name;
            char *address;
            char *phone;
    };

In this class the destructor is necessary to prevent memory, once allocated for the fields name, address and phone, from becoming unreachable when an object ceases to exist. In the following example a Person object is created, after which the data fields are printed. After this the main() function stops, which leads to the deallocation of memory. The destructor of the class is also shown for illustration purposes.

Note that in this example an object of the class Person is also created and destroyed via a pointer variable, using the operators new and delete.

    Person::~Person ()
    {
        // the strings are assumed to be allocated with strdup(), which
        // uses malloc(); they must therefore be released with free()
        free (name);
        free (address);
        free (phone);
    }

    int main ()
    {
        Person
            kk ("Karel", "Rietveldlaan",
                "050-426044"),
            *bill = new Person ("Bill Clinton",
                   "White House",
                   "09-1-202-142-3045");

        printf("%s, %s, %s\n",
            kk.getname (), kk.getaddress (), kk.getphone ());
        printf("%s, %s, %s\n",
            bill->getname (), bill->getaddress (), bill->getphone ());

        delete bill;
        return (0);
    }

The memory which is occupied by the object kk is released automatically when main() terminates: the C++ compiler makes sure that the destructor is called. Note, however, that the object pointed to by bill is handled differently. The variable bill is a pointer, and a pointer variable is, even in C++, in itself no Person. Therefore, before main() terminates, the memory occupied by the object pointed to by bill must be explicitly released; hence the statement delete bill. The operator delete will make sure that the destructor is called, thereby releasing the three strings of the object.

4.2 The assignment operator

Variables which are structs or classes can be directly assigned in C++ in the same way that structs can be assigned in C. The default action of such an assignment is a straight copy of all data fields from one compound type to the other; for a pointer field this means that only the pointer value is copied, not the memory it points to.

Let us now consider the consequences of this default action in a function such as the following:

    void printperson (Person const &p)
    {
        Person
            tmp;

        tmp = p;
        printf ("Name:     %s\n"
                "Address:  %s\n"
                "Phone:    %s\n",
            tmp.getname (), tmp.getaddress (), tmp.getphone ());
    }

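We shall follow the execution of this function step by step. The function expects a reference to a Person as its parameter p. It then defines a local object tmp, for which the default constructor of Person is called. Next, the object referenced by p is copied to tmp: by default only the pointer values are copied, so that the same allocated strings are now addressed by two objects, p and tmp. Finally, when printperson() terminates, tmp is destroyed and the destructor of Person releases the memory pointed to by name, address and phone; unfortunately, this memory is also in use by p.

After the execution of printperson(), the object which was referenced by p therefore still contains pointers to strings, but these pointers address deallocated memory. This is undoubtedly not a desired effect of a function like the above. The deallocated memory will likely become occupied during subsequent allocations, thereby causing the previously held strings to become lost.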

In general it can be concluded that every class which contains a constructor and a destructor, and which contains pointer fields to address allocated memory, is a potential candidate for trouble. There is of course a possibility to intervene: this possibility will be discussed in the next section.
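The sharing of memory after a default assignment can be shown in a harmless way. The following is only a sketch: the struct Shallow and the function shares_memory_after_assignment() are hypothetical names, and the string lives in a local buffer rather than in allocated memory, so that nothing is released twice.

```cpp
#include <cstring>

// Hypothetical struct with a pointer field and NO user-defined assignment.
struct Shallow
{
    char *text;
};

// Returns true when, after a default assignment, both objects
// address the very same string in memory.
bool shares_memory_after_assignment ()
{
    char buffer [] = "Karel";

    Shallow a;
    Shallow b;
    a.text = buffer;
    b.text = 0;

    b = a;                      // default assignment: copies the pointer only

    b.text [0] = 'C';           // modify the string through b ...

    bool same_pointer = (a.text == b.text);
    // ... and the change is visible through a as well
    bool a_changed = (std::strcmp (a.text, "Carel") == 0);

    return (same_pointer && a_changed);
}
```

Had the string been allocated, and had Shallow a destructor releasing it, the destruction of b would have left a with a pointer to deallocated memory.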

Overloading the assignment operator

Obviously, the right way to assign one Person object to another, is not to copy the contents of the object byte by byte. A better way is to make an equivalent object; one with its own allocated memory, but which contains the same strings.

The `right' way to duplicate a Person object is illustrated in the following figure.

[Figure: the `right' way to duplicate a Person object]

There are a number of ways to achieve this. One solution consists of the definition of a special function to handle assignments of objects of the class Person. The purpose of this function would be to create a copy of an object, but one with its own name, address and phone strings. Such a member function might be:

    void Person::assign (Person const &other)
    {
        // release our own previously used memory; it was allocated
        // by strdup(), i.e. with malloc(), so free() must be used
        free (name);
        free (address);
        free (phone);

        // now copy the other's data
        name = strdup (other.name);
        address = strdup (other.address);
        phone = strdup (other.phone);
    }

Using this tool we could rewrite the offending function printperson():

    void printperson (Person const &p)
    {
        Person
            tmp;

        // make tmp a copy of p, but with its own allocated
        // strings
        tmp.assign (p);
        
        printf ("Name:     %s\n"
                "Address:  %s\n"
                "Phone:    %s\n",
            tmp.getname (), tmp.getaddress (), tmp.getphone ());

        // now it doesn't matter that tmp gets destroyed..
    }

In itself this solution is valid, although it only treats the symptoms. It requires that the programmer uses a specific member function instead of the operator =; the problem remains whenever this rule is not strictly adhered to. Experience shows that errare humanum est; a solution which doesn't depend on the programmer remembering such a rule is therefore preferable.

The problem of the assignment operator is solved by using operator overloading: the syntactic possibility of C++ to redefine the actions of an operator in a given context. Operator overloading was discussed earlier, when the operators << and >> were redefined for usage with streams such as cin, cout and cerr (see section CoutCinCerr).

Overloading the assignment operator is probably the most common form of operator overloading. However, a word of warning is appropriate: the fact that C++ allows operator overloading does not mean that this feature should be used at all times. The guiding rule is that an overloaded operator should simply do what the operator is designed to do, so that no ambiguity in its meaning is introduced.

Following this rule, operator overloading is minimized, which helps keep source files readable. Therefore, in our vision, the operators << and >> in the context of streams are misleading: the stream operations do not have anything in common with the bitwise shift operations.

The function operator=()

To achieve operator overloading in a context of a class, the class is simply expanded with a public function which states the operator. A corresponding function is then defined.

For example, to overload the addition operator + a function operator+() would be defined. The function name consists of the keyword operator and the operator itself.

In our case we define a new function operator=() to redefine the actions of the assignment operator. A possible extension to the class Person could therefore be:

    // new declaration of the class
    class Person
    {
        public:
            .
            .
            void operator= (Person const &other);
            .
            .
        private:
            .
            .
    };

    // definition of the function
    void Person::operator= (Person const &other)
    {
        // deallocate old data; it was allocated by strdup(),
        // i.e. with malloc(), so free() must be used
        free (name);
        free (address);
        free (phone);

        // make duplicates of other's data
        name = strdup (other.name);
        address = strdup (other.address);
        phone = strdup (other.phone);
    }

The function operator=() which is presented above is the first version of the overloaded assignment. We shall present better and less bug-prone versions later.

The actions of this member function are similar to those of the previously proposed function assign(), but the name makes sure that this function is also activated when the assignment operator = is used. In fact there are two ways to call this function, which are illustrated below:

    Person
        pers ("Frank", "Oostumerweg 23", "2223"),
        copy;
        
    // first possibility
    copy = pers;

    // second possibility
    copy.operator= (pers);

It is obvious that the second possibility, in which operator=() is explicitly stated, is not used often. The code fragment however illustrates the similarity of the two methods of calling the function.

4.3 The this pointer

As we have seen, a member function of a given class is always called in the context of some object of the class; there is always an implicit `substrate' for the function to act on. C++ defines a keyword, this, to address this substrate (this is not available in the not yet discussed static member functions). The this keyword is a pointer, which always contains the address of the object in question. The this pointer is implicitly declared in each member function (whether public or private); therefore, it is as if each member function of the class Person contained the following declaration:

    extern Person *this;

A member function like setname(), which sets a name field of a Person to a given string, could therefore be implemented in two ways: with or without the this pointer:

    // alternative 1: implicit usage of this
    void Person::setname (char const *n)
    {
        free (name);            // name was allocated by strdup()
        name = strdup (n);
    }

    // alternative 2: explicit usage of this
    void Person::setname (char const *n)
    {
        free (this->name);      // name was allocated by strdup()
        this->name = strdup (n);
    }

Explicit usage of the this pointer is not very frequent. There are, however, a number of situations where the this pointer is needed.
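That this indeed holds the address of the object for which a member function is called can be checked with a small sketch. The class Probe and its members are hypothetical names used only here.

```cpp
// Hypothetical class used only to show what this refers to.
class Probe
{
    public:
        Probe () : total (0) {}

        // returns the address stored in the this pointer
        Probe const *self () const
        {
            return (this);
        }

        // explicit usage of this; same effect as: total += n;
        void add (int n)
        {
            this->total += n;
        }

        int total;
};
```

For two objects p and q of this class, p.self() yields &p and q.self() yields &q: each call receives its own this.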

Preventing self-destruction with this

As we have seen, the operator = can be redefined for the class Person in such a way that two objects of the class can be assigned, leading to two copies of the same object.

As long as the two variables are different ones, the previously presented version of the function operator=() will function properly: the memory of the assigned object is released, after which it is allocated again to hold new strings. However, when an object is assigned to itself (which is called auto-assignment), a problem occurs: the allocated strings of the assigned object are first released, but this also releases the strings of the right-hand side variable, since both are the same object! An example of this situation is illustrated below:

    void fubar (Person &p)  // non-const: p is assigned to
    {
        p = p;          // auto-assignment!
    }

In this example it is perfectly clear that something unnecessary, possibly even wrong, is happening. Auto-assignment can however occur in more hidden forms:

    Person
        one,
        two,
        *pp;

    pp = &one;
    .
    .
    *pp = two;
    .
    .
    one = *pp;

The problem of the auto-assignment can be solved by using the this pointer. In the overloaded assignment operator function we simply test whether the address of the right-hand side object is the same as the address of the current object: if so, no action needs to be taken. The definition of the function operator=() then becomes:

    void Person::operator= (Person const &other)
    {
        // only take action if address of current object
        // (this) is NOT equal to address of other
        // object (&other):

        if (this != &other)
        {
            // the strings were allocated by strdup(): use free()
            free (name);
            free (address);
            free (phone);

            name = strdup (other.name);
            address = strdup (other.address);
            phone = strdup (other.phone);
        }
    }

This is the second version of the overloaded assignment function. One still better version remains to be discussed.

Note the usage of the address operator in the statement

    if (this != &other)

The variable this is a pointer to the `current' object, while other is a reference; which is an `alias' to an actual Person object. The address of the other object is therefore &other, while the address of the current object is this.

Associativity of operators and this

The syntax of C++ states that the associativity of the assignment operator is to the right-hand side; i.e., in a statement as

    a = b = c;

the expression b = c is evaluated first, and the result is assigned to a.

The implementation of the overloaded assignment operator so far does not permit such constructions, as an assignment using the member function returns nothing (void). We can therefore conclude that the previous implementation does circumvent an allocation problem, but is not quite syntactically right.

The syntactical problem can be illustrated as follows. When we rewrite the expression a = b = c to the form which explicitly mentions member functions, we get:

    a.operator= (b.operator= (c));

This is syntactically wrong, since the sub-expression b.operator=(c) yields void; and the class Person contains no member functions with the prototype operator=(void).

This problem can also be remedied by using the this pointer. The overloaded assignment function expects as its argument a reference to a Person object; in the same way it can return a reference to such an object. This reference can then be used as an argument for a nested assignment.

It is customary to let the overloaded assignment return a reference to the current object (i.e., *this), as a const reference. The (final) version of the overloaded assignment operator for the class Person thus becomes:

    // declaration in the class
    class Person
    {
        public:
            .
            .
            Person const &operator= (Person const &other);
            .
            .
    };

    // definition of the function
    Person const &Person::operator= (Person const &other)
    {
        // only take action when no auto-assignment occurs
        if (this != &other)
        {
            // deallocate own data; the strings were allocated
            // by strdup(), so free() must be used
            free (address);
            free (name);
            free (phone);

            // duplicate other's data
            address = strdup (other.address);
            name = strdup (other.name);
            phone = strdup (other.phone);
        }

        // return current object, compiler will make sure
        // that a const reference is returned
        return (*this);
    }
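To try out the guard against auto-assignment and the chained assignment in isolation, a reduced class can be used. The following is a sketch under stated assumptions: the class Label and its private helper duplicate() are hypothetical, and the string is allocated with new [] and released with delete [] so that allocation and deallocation match.

```cpp
#include <cstring>

// Hypothetical class Label: owns one string allocated with new [].
class Label
{
    public:
        Label (char const *s = "") : text (duplicate (s)) {}
        Label (Label const &other) : text (duplicate (other.text)) {}
        ~Label () { delete [] text; }

        // returns a reference to the current object, so that
        // a = b = c is possible
        Label const &operator= (Label const &other)
        {
            if (this != &other)         // guard against auto-assignment
            {
                delete [] text;
                text = duplicate (other.text);
            }
            return (*this);
        }

        char const *get () const { return (text); }

    private:
        // helper (an assumption of this sketch): copies s into
        // memory obtained from new []
        static char *duplicate (char const *s)
        {
            char *copy = new char [std::strlen (s) + 1];
            std::strcpy (copy, s);
            return (copy);
        }

        char *text;
};
```

Given three Label objects a, b and c, the statement a = b = c leaves all three holding the same text, each in its own allocated memory; assigning an object to itself leaves it unchanged.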

4.4 The copy constructor: Initialization vs. assignment

In the following sections we shall look closer at another usage of the operator =. We shall use a class String as an example. This class is meant to handle allocated strings and is defined as follows:

    class String
    {
        public:
            // constructors, destructor
            String ();
            String (char const *s);
            ~String ();

            // overloaded assignment
            String const &operator= (String const &other);

            // interface functions
            void set (char const *data);
            char const *get (void) const;

        private:
            // one data field: ptr to allocated string
            char *str;
    };

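Concerning this definition we remark the following. The class contains a pointer str, possibly pointing to allocated memory; consequently, the class needs a constructor and a destructor, and for the same reason it has an overloaded assignment operator. Besides a default constructor, the class has a constructor which expects one string argument. The only interface functions are set() and get(), which set the string part of the object and retrieve it.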

Now let's consider the following code fragment. The statement references are discussed below the example:

    String
        a ("Hello World\n"),            // see (1)
        b,                              // see (2)
        c = a;                          // see (3)

    int main ()
    {
        b = c;                          // see (4)
        return (0);
    }

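Statement (1) creates the object a using the constructor which expects a char const * argument, and statement (2) creates b using the default constructor. Statement (3) also creates a new object, c, which is initialized with the existing object a; although the operator = is used, this is an initialization and not an assignment. Only statement (4) is an assignment, for which the overloaded assignment operator is called. The simple rule which applies here is that whenever an object is created, a constructor is needed. The form of the constructor is still the same: it has no return value, it carries the name of its class, and its argument list can be deduced from the code, the argument being stated either between parentheses or following a =.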

We conclude therefore that, given the above code statement (3), the class String must be rewritten to define a copy constructor:

    // class definition
    class String
    {
        public:
            .
            .
            String (String const &other);
            .
            .
    };

    // constructor definition
    String::String (String const &other)
    {
        str = strdup (other.str);
    }
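The distinction between initialization and assignment can be verified by counting calls. In the following sketch the class Tracker, its two counters and the function init_and_assign() are hypothetical; the class holds no data and only records which function is called.

```cpp
// Hypothetical class that records which special member function ran.
class Tracker
{
    public:
        static int copies;          // copy constructor calls
        static int assignments;     // operator= calls

        Tracker () {}
        Tracker (Tracker const &) { ++copies; }
        Tracker const &operator= (Tracker const &)
        {
            ++assignments;
            return (*this);
        }
};

int Tracker::copies = 0;
int Tracker::assignments = 0;

// Mirrors the statements marked (3) and (4) in the String example.
void init_and_assign ()
{
    Tracker a;
    Tracker b;
    Tracker c = a;      // initialization: copy constructor
    b = c;              // assignment: operator=
}
```

After one call of init_and_assign() both counters hold the value 1, even though the operator = appears twice in the function.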

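The actions of the copy constructor are similar to those of the overloaded assignment operator function: an object is duplicated, so that it contains its own allocated data. The copy constructor function is however simpler in two respects: it doesn't need to deallocate previously allocated memory, since the object in question has just been created and cannot yet have its own allocated data; and it never needs to check for auto-duplication, since no variable can be initialized with itself.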

Besides the above mentioned quite obvious usage of the copy constructor, this constructor has other important tasks. All of these tasks are related to the fact that the copy constructor is always called when an object is created and initialized with another object, even when this new object is a hidden or temporary variable: for example, when an object is passed to a function by value, or when a function returns an object by value.

To demonstrate that copy constructors are not called in all situations, consider a function getline() which reads a line of keyboard input and returns it as a String. Such a function could be written in the following form:

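    String getline ()
    {
        char
            buf [100];          // buffer for kbd input

        // read one line; fgets(), unlike gets(), cannot overrun buf
        // (note that fgets() keeps the trailing newline)
        if (!fgets (buf, sizeof (buf), stdin))
            buf [0] = 0;        // nothing read: return an empty string
        return (buf);           // and return it
    }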

This code fragment is quite valid, even though the return value char* doesn't match the prototype String. In this situation, C++ will try to convert the char* to a String: this is indeed possible, given a constructor which expects a char* argument. This means that the copy constructor is not used in this version of getline(). Instead, the constructor expecting a char* argument is used.
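The conversion can be observed with a simplified string class. This is a sketch only: the class Text, its counter conversions and the function greeting() are hypothetical stand-ins for String and getline(), and the keyboard input is replaced by a fixed string so that the result is predictable.

```cpp
#include <cstring>

// Hypothetical, simplified string class that counts how often
// the constructor expecting a char const * is called.
class Text
{
    public:
        static int conversions;

        Text (char const *s) : str (new char [std::strlen (s) + 1])
        {
            std::strcpy (str, s);
            ++conversions;
        }
        Text (Text const &other)
            : str (new char [std::strlen (other.str) + 1])
        {
            std::strcpy (str, other.str);
        }
        ~Text () { delete [] str; }

        char const *get () const { return (str); }

    private:
        // assignment is declared but not needed in this sketch
        Text const &operator= (Text const &);

        char *str;
};

int Text::conversions = 0;

// The return type is Text, but a char array is returned:
// the compiler converts it using Text (char const *).
Text greeting ()
{
    char buf [100];

    std::strcpy (buf, "Hello World");
    return (buf);
}
```

A statement such as Text t = greeting(); increments the counter exactly once: the returned object is built from the char array by the constructor expecting a char const *.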

Similarities between the copy constructor and operator=()

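The similarities between the copy constructor on the one hand and the overloaded assignment operator on the other are reinvestigated in this section. We present here two primitive functions which often occur in `our' code, and which we think are quite useful. We remark that the duplication of (private) data occurs both in the copy constructor and in the overloaded assignment function, while the deallocation of used memory occurs both in the overloaded assignment function and in the destructor.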

The two above actions (duplication and deallocation) can be coded in two primitive functions, say copy() and destroy(), which are used in the overloaded assignment operator, the copy constructor, and the destructor. When we apply this method to the class Person, we can rewrite the code as follows.

First, the class definition is expanded with two private functions copy() and destroy(). The purpose of these functions is to unconditionally copy the data of another object or to deallocate the memory of the current object. Hence these functions implement `primitive' functionality:

    // class definition, only relevant functions are shown here
    class Person
    {
        public:
            // constructors, destructor
            Person (Person const &other);
            ~Person ();

            // overloaded assignment
            Person const &operator= (Person const &other);
            .
            .
        private:
            // data fields
            char *name, *address, *phone;

            // the two primitives
            void copy (Person const &other);
            void destroy (void);
    };

Next, we present the implementation of the functions copy() and destroy():

    // copy(): unconditionally copy other object's data
    void Person::copy (Person const &other)
    {
        name = strdup (other.name);
        address = strdup (other.address);
        phone = strdup (other.phone);
    }

    // destroy(): unconditionally deallocate data
    void Person::destroy ()
    {
        // the strings were allocated by strdup(): use free()
        free (name);
        free (address);
        free (phone);
    }

Finally, the three public functions in which another object's data are copied or in which memory is deallocated are rewritten:

    // copy constructor
    Person::Person (Person const &other)
    {
        // unconditionally copy other's data
        copy (other);
    }

    // destructor
    Person::~Person ()
    {
        // unconditionally deallocate
        destroy ();
    }

    // overloaded assignment
    Person const &Person::operator= (Person const &other)
    {
        // only take action if no auto-assignment
        if (this != &other)
        {
            destroy ();
            copy (other);
        }

        // return (reference to) current object for
        // chain-assignments
        return (*this);
    }

4.5 More operator overloading

The following sections present more examples of operator overloading.

Overloading operator[]

As one more example of operator overloading, we present here a class which is meant to represent an array of ints. Indexing the array elements occurs with the standard array operator [], but additionally the class checks for boundary overflow.

An example of the usage of the class is given below:

    int main ()
    {
        Intarray
            x (20);             // 20 ints

        for (int i = 0; i < 20; i++)
            x [i] = i * 2;      // assign the elements

        // i runs up to and including 20: deliberately one too far
        for (int i = 0; i <= 20; i++)
            printf ("At index %d: value %d\n",
                    i, x [i]);

        return (0);
    }

This example shows how an array is created to hold 20 ints. The elements of the array can be assigned or retrieved. The above example should produce a run-time error, which is generated by the class Intarray: the last for loop causes a boundary overflow, since x[20] is addressed while legal indexes range from 0 to 19.

The definition of the class is given below:

    class Intarray
    {
         public:
            // constructors, destructor etc.
            Intarray (int sz = 1);          // default size: 1 int
            Intarray (Intarray const &other);
            ~Intarray ();
            Intarray const &operator= (Intarray const &other);

            // the interface
            int &operator[] (int index);

        private:
            // data
            int *data, size;
    };

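Concerning this class definition we remark the following. The class has a constructor with a default int argument specifying the array size; because of the default value, this function also serves as the default constructor. The class internally uses a pointer to reach allocated memory, hence the necessary tools are provided: a copy constructor, an overloaded assignment function and a destructor. The interface consists of a function operator[]() which returns a reference to an int, so that an expression such as x [10] can be used both as the left-hand and as the right-hand side of an assignment.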

The member functions of the class are given below.

    // constructor
    Intarray::Intarray (int sz)
    {
        // check for legal size specification
        if (sz < 1)
        {
            printf ("Intarray: size of array must be >= 1, not %d!\n", sz);
            exit (1);
        }

        // remember size, create array
        size = sz;
        data = new int [sz];
    }

    // copy constructor
    Intarray::Intarray (Intarray const &other)
    {
        // set size
        size = other.size;

        // create array
        data = new int [size];

        // copy other's values
        for (int i = 0; i < size; i++)
            data [i] = other.data [i];
    }

    // destructor
    Intarray::~Intarray ()
    {
        // release the array allocated by new []
        delete [] data;
    }

    // overloaded assignment
    Intarray const &Intarray::operator= (Intarray const &other)
    {
        // take action only when no auto-assignment
        if (this != &other)
        {
            // set size
            size = other.size;
            // remove previous memory, create new array
            delete [] data;
            data = new int [size];
            // copy other's data
            for (int i = 0; i < size; i++)
                data [i] = other.data [i];
        }
        return (*this);
    }

    // here is the interface function
    int &Intarray::operator[] (int index)
    {
        // check for array boundary over/underflow
        if (index < 0 || index >= size)
        {
            printf ("Intarray: boundary overflow or underflow, "
                    "index=%d, should range from 0 to %d\n",
                    index, size - 1);
            exit (1);
        }

        // emit the reference
        return (data [index]);
    }
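A compact, self-contained variant of the class can be sketched as follows. Note the assumptions of this sketch, which deviate from the code above: errors are reported by throwing the standard exceptions std::invalid_argument and std::out_of_range instead of calling exit(), so that a caller can observe them; new elements are set to zero; and the overloaded assignment allocates the new array before releasing the old one.

```cpp
#include <stdexcept>

// Variant of Intarray for illustration; see the assumptions above.
class Intarray
{
    public:
        Intarray (int sz = 1) : data (0), size (sz)
        {
            if (sz < 1)
                throw std::invalid_argument ("Intarray: size must be >= 1");
            data = new int [sz] ();     // () sets all elements to zero
        }
        Intarray (Intarray const &other)
            : data (new int [other.size]), size (other.size)
        {
            for (int i = 0; i < size; i++)
                data [i] = other.data [i];
        }
        ~Intarray () { delete [] data; }    // releases the array

        Intarray const &operator= (Intarray const &other)
        {
            if (this != &other)             // no auto-assignment
            {
                int *fresh = new int [other.size];
                for (int i = 0; i < other.size; i++)
                    fresh [i] = other.data [i];
                delete [] data;
                data = fresh;
                size = other.size;
            }
            return (*this);
        }

        // checked access: returns a reference to the element
        int &operator[] (int index)
        {
            if (index < 0 || index >= size)
                throw std::out_of_range ("Intarray: index out of range");
            return (data [index]);
        }

    private:
        int *data, size;
};
```

With this variant, addressing x [20] in an array of 20 ints throws std::out_of_range, which the caller may catch; a copy made with Intarray y = x; has its own elements.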

Cin, cout, cerr and their operators

This section describes how a class can be adapted for the usage with the C++ streams cout and cerr and the operator <<. Adaptation of a class for the usage with cin and its operator >> occurs in a similar way and is not illustrated here.

The implementation of an overloaded operator << in the context of cout or cerr involves the class of these objects, which is ostream. This class is declared in the header file iostream.h (iostream in standard C++) and defines overloaded operator functions only for `basic' types, such as int, char*, etc. The purpose of this section is to show how an operator function can be defined which processes a new class, say Person (see section Person), so that constructions such as the following become possible:

    Person
        kr ("Kernighan and Ritchie", "unknown", "unknown");

    cout << "Name, address and phone number of Person kr:\n"
         << kr
         << '\n';

The statement cout << kr involves the operator << and its two operands: an ostream& and a Person&. The proposed action is defined in a class-less operator function operator<<() expecting two arguments:

    // declaration in, say, person.h
    extern ostream &operator<< (ostream &, Person const &);

    // definition in some source file
    ostream &operator<< (ostream &stream, Person const &pers)
    {
        return (stream << "Name:    " << pers.getname () << '\n'
                       << "Address: " << pers.getaddress () << '\n'
                       << "Phone:   " << pers.getphone ()
               );
    }

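Concerning this function we remark that it returns a reference to an ostream object, which enables `chaining' of the operator, and that the two operands of the operator << are stated as the two arguments of the overloading function.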
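The same technique can be tried without a terminal by inserting into a string stream. The following is a sketch: the struct Contact and the function to_text() are hypothetical, Contact being a reduced stand-in for Person, and the standard headers with the std:: prefix are used instead of iostream.h.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical stand-in for Person with just two fields.
struct Contact
{
    char const *name;
    char const *phone;
};

// Class-less operator function: the left operand is the stream,
// the right operand is the object to insert.
std::ostream &operator<< (std::ostream &stream, Contact const &c)
{
    return (stream << "Name:  " << c.name << '\n'
                   << "Phone: " << c.phone << '\n');
}

// Formats a Contact into a std::string via an output string stream.
std::string to_text (Contact const &c)
{
    std::ostringstream out;

    // chaining works because operator<< returns the stream
    out << "Contact:\n" << c;
    return (out.str ());
}
```

Because the operator function expects an ostream reference, the same function serves cout, cerr and the string stream used here.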

4.6 Conclusion

Two important extensions to classes have been discussed in this chapter: the overloaded assignment operator and the copy constructor. As we have seen, classes with pointer data which address allocated memory are potential sources of semantic errors. The two introduced extensions are the only measures against unintentional loss of allocated data.

The conclusion is therefore: as soon as a class is defined where pointer data are used, an overloaded assignment function and a copy constructor should be implemented.
