We're always interested in getting feedback. E-mail us if you like this guide, if you think that important material is omitted, if you encounter errors in the code examples or in the documentation, if you find any typos, or generally just if you feel like e-mailing. Mail to Karel Kubat (karel@icce.rug.nl) or use an e-mail form. Please state the concerned document version, found in the title. If you're interested in a printable PostScript copy, use the form, or better yet, pick up your own copy via ftp at ftp.icce.rug.nl/pub/http.
As we have seen in the previous chapter, C++ provides the tools to derive classes from one base type, to use base class pointers to address derived objects, and subsequently to process derived objects in a generic class.
Concerning the allowed operations on all objects in such a generic class we
have seen that the base class must define the actions to be performed on all
derived objects. In the example of the Vehicle
this was the functionality
to store and retrieve the weight of a vehicle.
When using a base class pointer to address an object of a derived class, the
pointer type (i.e., the base class type) normally determines which actual
function will be called. This means that the code example from section
VStorage, which uses the storage class VStorage, will incorrectly compute the
combined weight when a Truck object (see section Truck) is in the storage:
only one weight field, that of the cabin part of the truck, is taken into
consideration. The reason for this is obvious: a Vehicle *vp calls the
function Vehicle::getweight() and not Truck::getweight(), even when that
pointer actually points to a Truck.
The opposite is however also possible. I.e., C++ makes it possible that a
Vehicle *vp calls a function Truck::getweight() when the pointer actually
points to a Truck. The terminology for this feature of C++ is polymorphism:
it is as though the pointer vp assumes several forms when pointing to several
objects. In other words, vp might behave like a Truck* when pointing to a
Truck, or like an Auto* when pointing to an Auto, etc.
A second term for this feature is late binding. This name refers to the fact that the decision which function to call (one of the base class or one of the derived classes) cannot be made at compile-time. The right function is selected at run-time.
The default behavior of the activation of a member function via a pointer is
that the type of the pointer determines the function. E.g., a Vehicle* will
activate Vehicle's member functions, even when pointing to an object of a
derived class. This is referred to as early or static binding, since the
function to call is known at compile-time. Late or dynamic binding is
achieved in C++ with virtual functions.
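The difference between the two kinds of binding can be made visible with a minimal sketch. The classes Parent and Child and the function viabasepointer() are hypothetical names which do not occur elsewhere in this guide; the sketch merely assumes one non-virtual and one virtual member function:

```cpp
#include <string>

// Parent and Child are hypothetical classes, used only for this sketch.
class Parent
{
    public:
        virtual ~Parent () {}
        // non-virtual: early binding, the pointer type decides
        std::string early () const { return ("Parent"); }
        // virtual: late binding, the object type decides
        virtual std::string late () const { return ("Parent"); }
};

class Child: public Parent
{
    public:
        std::string early () const { return ("Child"); }
        std::string late () const { return ("Child"); }
};

// Calls both functions via a Parent* which points to a Child.
std::string viabasepointer ()
{
    Child c;
    Parent *pp = &c;
    return (pp->early () + "/" + pp->late ());
}
```

Here viabasepointer() yields "Parent/Child": the non-virtual function is selected by the type of the pointer, the virtual function by the type of the object.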
A function becomes virtual when its declaration starts with the keyword
virtual
. Once a function is declared virtual
in a base class, its
definition remains virtual
in all derived classes; even when the keyword
virtual
is not repeated in the definition of the derived classes.
As far as the vehicle classification system is concerned (see section
VehicleSystem
ff.) the two member functions getweight()
and
setweight()
might be declared as virtual
. The class definitions
below illustrate the classes Vehicle
(which is the overall base class of
the classification system) and Truck
, which has Vehicle
as an
indirect base class. The functions getweight()
of the two classes are
also shown:
class Vehicle
{
public:
// constructors
Vehicle ();
Vehicle (int wt);
// interface.. now virtuals!
virtual int getweight () const;
virtual void setweight (int wt);
private:
// data
int weight;
};
// Vehicle's own getweight() function:
int Vehicle::getweight () const
{
return (weight);
}
class Land: public Vehicle
{
.
.
};
class Auto: public Land
{
.
.
};
class Truck: public Auto
{
public:
// constructors
Truck ();
Truck (int engine_wt, int sp, char const *nm,
int trailer_wt);
// interface: to set two weight fields
void setweight (int engine_wt, int trailer_wt);
// and to return combined weight
int getweight () const;
private:
// data
int trailer_weight;
};
// Truck's own getweight() function
int Truck::getweight () const
{
return (Auto::getweight () + trailer_weight);
}
Note that the keyword virtual
appears only in the definition of the base
class Vehicle
; it need not be repeated in the derived classes (though a
repetition would be no error).
The effect of the late binding is illustrated in the next fragment:
Vehicle
v (1200); // vehicle with weight 1200
Truck
t (6000, 115, // truck with cabin weight 6000, speed 115,
"Scania", // make Scania, trailer weight 15000
15000);
Vehicle
*vp; // generic vehicle pointer
int main ()
{
// see below (1)
vp = &v;
printf ("%d\n", vp->getweight ());
// see below (2)
vp = &t;
printf ("%d\n", vp->getweight ());
// see below (3)
printf ("%d\n", vp->getspeed ());
return (0);
}
Since the function getweight() is defined as virtual, late binding is used
here: in the statements below the (1) mark, Vehicle's function getweight() is
called. In contrast, the statements under (2) use Truck's function
getweight().
Statement (3) however will still lead to a compilation error. A function
getspeed() is not a member of Vehicle, and hence also not callable via a
Vehicle*.
The rule is that when using a pointer to a class, only the functions which
are members of that class can be called. These functions can be virtual
,
but this only affects the type of binding (early vs. late).
When functions are defined as virtual in a base class (and hence in all
derived classes), and when these functions are called using a pointer to the
base class, the pointer as it were can assume more forms: it is polymorphic. In
this section we illustrate the effect of polymorphism on the manner in which
programs in C++ can be developed.
A vehicle classification system in C might be implemented with
Vehicle
being a union of struct
s, and having an enumeration field to
determine which actual type of vehicle is represented. A function
getweight()
would typically first determine what type of vehicle is
represented, and then inspect the relevant fields:
typedef enum /* type of the vehicle */
{
is_vehicle,
is_land,
is_auto,
is_truck,
} Vtype;
typedef struct /* generic vehicle type */
{
int weight;
} Vehicle;
typedef struct /* land vehicle: adds speed */
{
Vehicle v;
int speed;
} Land;
typedef struct /* auto: Land vehicle + name */
{
Land l;
char *name;
} Auto;
typedef struct /* truck: Auto + trailer */
{
Auto a;
int trailer_wt;
} Truck;
typedef union /* all sorts of vehicles in 1 union */
{
Vehicle v;
Land l;
Auto a;
Truck t;
} AnyVehicle;
typedef struct /* the data for all vehicles */
{
Vtype type;
AnyVehicle thing;
} Object;
int getweight (Object *o) /* how to get weight of a vehicle */
{
switch (o->type)
{
case is_vehicle:
return (o->thing.v.weight);
case is_land:
return (o->thing.l.v.weight);
case is_auto:
return (o->thing.a.l.v.weight);
case is_truck:
return (o->thing.t.a.l.v.weight +
o->thing.t.trailer_wt);
}
return (0); /* unknown type: not reached for a valid Object */
}
A disadvantage of this approach is that the implementation cannot be easily
changed. E.g., if we wanted to define a type Airplane
, which would, e.g.,
add the functionality to store the number of passengers, then we'd have to
re-edit and re-compile the above code.
In contrast, C++ offers the possibility of polymorphism. The advantage is
that `old' code remains usable. The implementation of an extra class Airplane
would in C++ mean one extra class, possibly with its own (virtual) functions
getweight() and setweight(). A function like:
void printweight (Vehicle const *any)
{
printf ("Weight: %d\n", any->getweight ());
}
would still work; the function wouldn't even need to be recompiled, since late binding is in effect.
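As a minimal sketch of this situation, consider a stripped-down Vehicle, the `old' function printweight(), and a class Airplane which is added afterwards. The constructor arguments of Airplane and the figure of 80 weight units per passenger are assumptions made for this illustration only:

```cpp
#include <stdio.h>

class Vehicle
{
    public:
        Vehicle (int wt): weight (wt) {}
        virtual ~Vehicle () {}
        virtual int getweight () const { return (weight); }
    private:
        int weight;
};

// `old' code: written with only Vehicle in mind
void printweight (Vehicle const *any)
{
    printf ("Weight: %d\n", any->getweight ());
}

// `new' code, added later. The 80 units per passenger is an
// assumption for this sketch, not part of the classification system.
class Airplane: public Vehicle
{
    public:
        Airplane (int wt, int pass): Vehicle (wt), passengers (pass) {}
        int getweight () const
        {
            return (Vehicle::getweight () + 80 * passengers);
        }
    private:
        int passengers;
};
```

When printweight() receives the address of an Airplane, the function Airplane::getweight() is called, although printweight() itself was written without any knowledge of that class.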
This section briefly describes how polymorphism is implemented in C++. Understanding the implementation is not necessary for the usage of this feature of C++, though it does explain why there is a cost of polymorphism in terms of memory usage.
The fundamental idea of polymorphism is that the C++ compiler does not
know which function to call at compile-time; the right function can only be
selected at run-time. That means that the address of
the function must be stored
somewhere, to be looked up prior to the actual call. This `somewhere' place
must be accessible from the object in question. E.g., when a Vehicle *vp
points to a Truck
object, then vp->getweight()
calls a member
function of Truck
; the address of this function is determined from the
actual object which vp
points to.
The most common implementation is the following. An object which contains virtual functions holds as its first data member a hidden field, pointing to an array of pointers which hold the addresses of the virtual functions. It must be noted that this implementation is compiler-dependent, and is by no means dictated by the C++ ANSI definition.
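The table of the addresses of virtual functions is shared by all objects of the class. It even may be the case that two classes share the same table. The overhead in terms of memory consumption is therefore one extra pointer field per object, plus one table of function pointers per class (or per group of classes sharing a table).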
A statement like vp->getweight()
therefore first inspects the hidden data
member of the object pointed to by vp
. In the case of the vehicle
classification system, this data member points to a table of two addresses:
one pointer for the function getweight()
and one pointer for the function
setweight()
. The actual function which is called is determined from this
table.
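The mechanism can be imitated by hand, which may help to see what a call like vp->getweight() amounts to. The following is a sketch only: the names VehicleObj, VTable, vehicle_table, truck_table and callgetweight() are hypothetical, and the layout which an actual compiler generates may differ:

```cpp
// A hand-written imitation of the mechanism; all names are hypothetical.
struct VehicleObj;

struct VTable                           // one table per class
{
    int  (*getweight) (VehicleObj const *);
    void (*setweight) (VehicleObj *, int);
};

struct VehicleObj
{
    VTable const *vptr;                 // the `hidden' field
    int weight;
    int trailer_weight;                 // only meaningful for trucks
};

int vehicle_getweight (VehicleObj const *v)
{
    return (v->weight);
}

int truck_getweight (VehicleObj const *v)
{
    return (v->weight + v->trailer_weight);
}

void any_setweight (VehicleObj *v, int wt)
{
    v->weight = wt;
}

// two tables: the truck table differs only in its getweight entry
VTable const vehicle_table = { vehicle_getweight, any_setweight };
VTable const truck_table   = { truck_getweight,   any_setweight };

// what a call like vp->getweight() boils down to in this imitation
int callgetweight (VehicleObj const *vp)
{
    return (vp->vptr->getweight (vp));
}
```

The caller never names vehicle_getweight() or truck_getweight(): the function is found via the table which the object itself points to.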
The organization of the objects concerning virtual functions is further illustrated in the following figure:
As can be seen from figure ImplementationFigure, all objects which use
virtual functions must have one (hidden) data member to address a table of
function pointers. The objects of the classes Vehicle and Auto both address
the same table. The class Truck however introduces its own version of
getweight(): therefore, this class needs its own table of function pointers.
Until now the base class Vehicle
contained its own, concrete,
implementations of the virtual functions getweight()
and
setweight()
. In C++ it is however also possible only to mention
virtual functions in a base class, and not define them. The functions are
concretely implemented in a derived class. This approach defines a
protocol, which has to be followed in the derived classes.
The special feature of only declaring functions in a base class, and not defining them, is that derived classes must take care of the actual definition: the C++ compiler will not allow the definition of an object of a class which doesn't concretely define the function in question. The base class thus enforces a protocol by declaring a function by its name, return value and arguments; but the derived classes must take care of the actual implementation. The base class itself is therefore only a model, to be used for the derivation of other classes. Such base classes are also called abstract classes.
The functions which are only declared but not defined in the base class are
called pure virtual functions. A function is made pure virtual by
preceding its declaration with the keyword virtual
and by postfixing it
with = 0
. An example of a pure virtual function occurs in the following
listing, where the definition of a class Sortable
requires that all
subsequent classes have a function compare()
:
class Sortable
{
public:
virtual int compare (Sortable const &other) const = 0;
};
The function compare()
must return an int
and receives a reference
to a second Sortable
object. Possibly its action would be to compare the
current object with the other
one. The function is not allowed to alter
the other
object, as other
is declared const
. Furthermore, the function is not
allowed to alter the current object, as the function itself is declared
const
.
The above base class can be used as a model for derived classes. As an example
consider the following class Person
(a prototype of which was introduced
in section
Person
), capable of comparing two Person
objects by the alphabetical order of their names and addresses:
class Person: public Sortable
{
public:
// constructors, destructors, and stuff
Person ();
Person (char const *nm, char const *add, char const *ph);
Person (Person const &other);
Person const &operator= (Person const &other);
// interface
char const *getname () const;
char const *getaddress () const;
char const *getphone () const;
void setname (char const *nm);
void setaddress (char const *add);
void setphone (char const *ph);
// requirements enforced by Sortable
int compare (Sortable const &other) const;
private:
// data members
char *name, *address, *phone;
};
int Person::compare (Sortable const &o) const
{
// the protocol hands us a Sortable; we assume it really is a Person
Person const &other = (Person const &)o;
int cmp;
// first try: if names unequal, we're done
if ( (cmp = strcmp (name, other.name)) )
return (cmp);
// second try: compare by addresses
return (strcmp (address, other.address));
}
Note in the implementation of Person::compare()
that the argument of the
function is not a reference to a Person
but a reference to a
Sortable
. Remember that C++ allows function overloading: a function
compare(Person const &other)
would be an entirely different function
from the one required by the protocol of Sortable
. In the implementation
of the function we therefore cast the Sortable&
argument to a
Person&
argument.
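The benefit of such a protocol is that generic code can be written in terms of Sortable alone. The sketch below assumes a hypothetical derived class Number and a hypothetical function smallest(); neither occurs elsewhere in this guide. Number::compare() casts its argument in the same way as Person::compare() does:

```cpp
#include <cstddef>

class Sortable
{
    public:
        virtual ~Sortable () {}
        virtual int compare (Sortable const &other) const = 0;
};

// Number is a hypothetical derived class, used only for this sketch.
class Number: public Sortable
{
    public:
        Number (int v): value (v) {}
        int compare (Sortable const &o) const
        {
            // assumes that o really is a Number
            Number const &other = static_cast<Number const &> (o);
            return (value < other.value ? -1 : value > other.value);
        }
    private:
        int value;
};

// Generic code: knows the protocol only, not the derived classes.
// Assumes n >= 1.
Sortable const *smallest (Sortable const *const *items, std::size_t n)
{
    Sortable const *best = items [0];
    for (std::size_t i = 1; i < n; i++)
        if (items [i]->compare (*best) < 0)
            best = items [i];
    return (best);
}
```

The function smallest() would work unaltered for an array of pointers to Person objects, as long as all elements of the array are of the same derived class.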
Sometimes it may be useful to know in the concrete implementation of a pure
virtual function what the other object is. E.g., the function
Person::compare() should make the comparison only if the other object is a
Person too: imagine what the statement
strcmp (name, other.name)
would do if the other object were in fact not a Person and hence did not have
a char *name data member.
We therefore present here an improved version of the protocol of the class
Sortable
. This class is expanded to require that each derived class
implements a function int getsignature()
:
class Sortable
{
.
.
virtual int getsignature () const = 0;
.
.
};
The concrete function Person::compare()
can now compare names and
addresses only if the signatures of the current and other object match:
int Person::compare (Sortable const &o) const
{
int cmp;
// first, check signatures
if ( (cmp = getsignature () - o.getsignature ()) )
return (cmp);
Person const &other = (Person const &)o;
// next: if names unequal, we're done
if ( (cmp = strcmp (name, other.name)) )
return (cmp);
// last try: compare by addresses
return (strcmp (address, other.address));
}
The crux of the matter is of course the function getsignature()
. This
function should return a unique int
value for its particular class.
An elegant implementation is the following:
class Person: public Sortable
{
.
.
// getsignature() now required too
int getsignature () const;
};
int Person::getsignature () const
{
static int // Person's own tag, I'm quite sure
tag; // that no other class can access it
// note: the cast assumes that a pointer fits in an int; where it
// does not, compilers may reject the cast or truncate the address
return ( (int) &tag ); // hence, &tag is unique for Person
}
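On platforms where a pointer is wider than an int, the address of the tag can be returned as a pointer instead. The following sketch is a variation and not the protocol defined above: it assumes that getsignature() returns a void const *, and it introduces a hypothetical second class Book and a hypothetical helper sametype():

```cpp
// Variation on the protocol: the signature is a pointer, not an int.
class Sortable
{
    public:
        virtual ~Sortable () {}
        virtual void const *getsignature () const = 0;
};

class Person: public Sortable
{
    public:
        void const *getsignature () const
        {
            static char tag;            // Person's own tag
            return (&tag);
        }
};

// Book is a hypothetical second class, used only for this sketch.
class Book: public Sortable
{
    public:
        void const *getsignature () const
        {
            static char tag;            // Book's own tag
            return (&tag);
        }
};

// true when both objects are of the same derived class
bool sametype (Sortable const &a, Sortable const &b)
{
    return (a.getsignature () == b.getsignature ());
}
```

This sketch only tests signatures for equality. An ordering between the signatures of different classes, as used in Person::compare(), is not provided here.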
When the operator delete
releases memory which is occupied by a
dynamically allocated object, a corresponding destructor is called to ensure
that internally used memory of the object can also be released. Now consider
the following code fragment, in which the two classes from the previous
sections are used:
Sortable
*sp;
Person
*pp = new Person ("Frank", "frank@icce.rug.nl", "633688");
sp = pp; // sp now points to a Person
.
.
delete sp; // object destroyed
In this example an object of a derived class (Person
) is destroyed using a
base class pointer (Sortable*
). For a `standard' class definition this
will mean that the destructor of Sortable
is called, instead of the
destructor of Person
.
C++ however allows virtual destructors. By preceding the declaration of a
destructor with the keyword virtual
we can ensure that the right
destructor is activated even when called via a base class pointer. The
definition of the class Sortable
would therefore become:
class Sortable
{
public:
virtual ~Sortable ();
virtual int compare (Sortable const &other) const = 0;
.
.
};
Should the virtual destructor of the base class be a pure virtual
function or not? In general, the answer to this question would be no: for a
class such as Sortable
the definition should not force derived
classes to define a destructor. In contrast, compare()
is a pure virtual
function: in this case the base class defines a protocol which must be adhered
to.
By defining the destructor of the base class as virtual
, but not as
purely so, the base class offers the possibility of redefinition of the
destructor in any derived classes. The base class doesn't enforce the choice.
The conclusion is therefore that the base class must define a destructor function, which is used in the case that derived classes do not define their own destructors. Such a destructor could be an empty function:
Sortable::~Sortable ()
{
}
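The effect of the virtual destructor can be made visible by counting. In the sketch below the global variable destroyed and the function destroyviabase() are hypothetical additions for the demonstration; the classes are reduced to their destructors:

```cpp
int destroyed = 0;      // hypothetical counter, for the demonstration only

class Sortable
{
    public:
        virtual ~Sortable () { destroyed += 1; }
};

class Person: public Sortable
{
    public:
        ~Person () { destroyed += 10; }
};

// Destroys a Person via a Sortable* and reports what was counted.
int destroyviabase ()
{
    destroyed = 0;
    Sortable *sp = new Person;
    delete sp;          // ~Person () runs first, then ~Sortable ()
    return (destroyed);
}
```

With the virtual destructor, destroyviabase() returns 11: both destructors have been activated, the one of Person first.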
As was previously mentioned in chapter Inheritance it is possible to derive a class from several base classes at once. Such a derived class inherits the properties of all its base classes. Of course, the base classes themselves may be derived from classes yet higher in the hierarchy.
A slight difficulty in multiple inheritance may arise when more than one
`path' leads from the derived class to the base class. This is illustrated in
the code fragment below: a class Derived
is doubly derived from a class
Base
:
class Base
{
public:
void setfield (int val)
{ field = val; }
int getfield () const
{ return (field); }
private:
int field;
};
class Derived: public Base, public Base
{
};
Due to the double derivation, the functionality of Base
now occurs twice
in Derived
. This leads to ambiguity: when the function setfield()
is
called for a Derived
object, which function should that be, since
there are two? In such a duplicate derivation, many C++ compilers will fail to
generate code and (correctly) identify the error.
The above code clearly duplicates its base class in the derivation. Such a
duplication can be easily avoided here. But duplication of a base class can
also occur via nested inheritance, where an object is derived from, say, an
Auto and from an Air (see the vehicle classification system, section
VehicleSystem). Such a class would be needed to represent, e.g., a flying car.
An AirAuto would ultimately contain two Vehicles, and hence two weight fields,
two setweight() functions and two getweight() functions.
Let's investigate closer why an AirAuto
introduces ambiguity, when
derived from Auto
and Air
.
AirAuto
is an Auto
, hence a Land
, and hence a
Vehicle
.
AirAuto
is also an Air
, and hence a
Vehicle
.
The duplication of Vehicle
data is further illustrated in the
following figure:
The internal organization of an AirAuto
is shown in the
following figure:
The C++ compiler will detect the ambiguity in an AirAuto
object, and
will therefore fail to produce code for a statement like:
AirAuto
cool;
printf ("%d\n", cool.getweight());
The question of which member function getweight() should be called cannot
be resolved by the compiler. The programmer has two possibilities to resolve
the ambiguity explicitly.
First, the function call itself can be qualified with the scope operator:
// let's hope that the weight is kept in the Auto
// part of the object..
printf ("%d\n", cool.Auto::getweight ());
Note the place of the scope operator and the class name: before the name
of the member function itself.
Second, a dedicated function getweight() could be created for the class
AirAuto:
int AirAuto::getweight () const
{
return (Auto::getweight ());
}
The second possibility from the two above is preferable, since it relieves the
programmer who uses the class AirAuto
of special precautions.
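The two possibilities can be tried with the following reduced version of the classification system. This is a sketch: the constructors are simplified to one weight argument, and the choice to keep the weight in the Auto part and to set the Vehicle reached via Air to 0 is an assumption:

```cpp
class Vehicle
{
    public:
        Vehicle (int wt): weight (wt) {}
        int getweight () const { return (weight); }
    private:
        int weight;
};

class Land: public Vehicle
{
    public:
        Land (int wt): Vehicle (wt) {}
};

class Auto: public Land
{
    public:
        Auto (int wt): Land (wt) {}
};

class Air: public Vehicle
{
    public:
        Air (int wt): Vehicle (wt) {}
};

class AirAuto: public Auto, public Air
{
    public:
        // assumption: the weight is kept in the Auto part; the
        // Vehicle reached via Air is simply set to 0
        AirAuto (int wt): Auto (wt), Air (0) {}
        // the dedicated function: hides both inherited versions
        int getweight () const { return (Auto::getweight ()); }
};
```

For an AirAuto object cool, the call cool.getweight() is now unambiguous, while cool.Air::getweight() still reaches the second, unused weight field.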
However, besides these explicit solutions, there is a more elegant one. This will be discussed in the next section.
As is illustrated in figure
InternalOrganization
, more than
one object of the type Vehicle
is present in one AirAuto
. The
result is not only an ambiguity in the functions which access the weight
data, but also the presence of two weight
fields. This is somewhat
redundant, since we can assume that an AirAuto
has just one weight.
We can ensure that only one Vehicle is contained in an AirAuto. This is done
by ensuring that the base class which is multiply present in a derived class
is defined as a virtual base class. The behavior of virtual base classes is
the following: when a base class B is a virtual base class of a derived class
D, then the compiler includes the members of B in D only once. The members of
B are left out when they are already present in D via another virtual
derivation.
For the class AirAuto
this means that the derivation of Land
and
Air
is changed:
class Land: virtual public Vehicle
{
.
.
};
class Air: virtual public Vehicle
{
.
.
};
The virtual derivation ensures that via the Land
route, a Vehicle
is
only added to a class when not yet present. The same holds true for the
Air
route. This means that we can no longer say by which route a
Vehicle
becomes a part of an AirAuto
; we only can say that there is
one Vehicle
object embedded.
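A reduced sketch of the classification system shows the effect. The classes are stripped to the weight field, and the default argument of the Vehicle constructor is an assumption which keeps the example short:

```cpp
class Vehicle
{
    public:
        Vehicle (int wt = 0): weight (wt) {}
        int getweight () const { return (weight); }
        void setweight (int wt) { weight = wt; }
    private:
        int weight;
};

class Land: virtual public Vehicle
{
};

class Auto: public Land
{
};

class Air: virtual public Vehicle
{
};

// contains exactly one Vehicle, shared by the Auto and Air parts
class AirAuto: public Auto, public Air
{
};
```

For an AirAuto object cool, the call cool.setweight (1500) is accepted without a scope operator, and cool.Auto::getweight() and cool.Air::getweight() afterwards report the same value, since both address the same weight field.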
The internal organization of an AirAuto
after virtual derivation is
shown in the following figure:
Concerning virtual derivation we make the following final remarks:
Both Land and Air must be defined with virtual derivation. If only one of
them were virtually derived, the other would still contribute its own
Vehicle, and an AirAuto would again contain two Vehicle parts: virtual
derivation merges only those base class parts which are reached via virtual
routes.
The fact that the Vehicle in an AirAuto is no longer `embedded' in Auto or
Air has a consequence for the chain of construction. The constructor of an
AirAuto will directly call the constructor of a Vehicle; this constructor
will not be called from the constructors of Auto or Air.
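The consequence for the chain of construction is illustrated in the following sketch. The constructor arguments are simplified, and the values 1 and 2 are arbitrary: they are passed on towards Vehicle by the Auto and Air parts, but have no effect in an AirAuto:

```cpp
class Vehicle
{
    public:
        Vehicle (int wt = 0): weight (wt) {}
        int getweight () const { return (weight); }
    private:
        int weight;
};

class Land: virtual public Vehicle
{
    public:
        Land (int wt): Vehicle (wt) {}
};

class Auto: public Land
{
    public:
        Auto (int wt): Land (wt) {}
};

class Air: virtual public Vehicle
{
    public:
        Air (int wt): Vehicle (wt) {}
};

class AirAuto: public Auto, public Air
{
    public:
        // Vehicle (wt) initializes the one shared Vehicle; the
        // Vehicle (...) initializers in Land and Air are skipped
        // when an AirAuto is constructed
        AirAuto (int wt): Vehicle (wt), Auto (1), Air (2) {}
};
```

An AirAuto constructed with the argument 1500 therefore has the weight 1500, not 1 or 2.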
Summarizing, virtual derivation has the consequence that ambiguity in the calling of member functions of a base class is avoided. Furthermore, duplication of data members is avoided.
In contrast to the previous definition of a class such as AirAuto
,
situations may arise where the double presence of the members of a base class
is appropriate. To illustrate this, consider the definition of a Truck
from section
Truck
:
class Truck: public Auto
{
public:
// constructors
Truck ();
Truck (int engine_wt, int sp, char const *nm,
int trailer_wt);
// interface: to set two weight fields
void setweight (int engine_wt, int trailer_wt);
// and to return combined weight
int getweight () const;
private:
// data
int trailer_weight;
};
// example of constructor
Truck::Truck (int engine_wt, int sp, char const *nm,
int trailer_wt)
: Auto (engine_wt, sp, nm)
{
trailer_weight = trailer_wt;
}
// example of interface function
int Truck::getweight () const
{
return
( // sum of:
Auto::getweight () + // engine part plus
trailer_weight // the trailer
);
}
This definition shows how a Truck
object is constructed to hold two
weight fields: one via its derivation from Auto
and one via its own
int trailer_weight
data member. Such a definition is of course valid, but
could be rewritten. We could let a Truck
be derived from an Auto
and from a Vehicle
, thereby explicitly requesting the double
presence of a Vehicle
; one for the weight of the engine and cabin, and
one for the weight of the trailer.
A small item of interest here is that a derivation like
class Truck: public Auto, public Vehicle
does not work as intended: a Vehicle is already part of an Auto, so the
members of the directly inherited Vehicle cannot be referred to unambiguously
(compilers typically reject or warn about such a derivation). An intermediate
class resolves the problem: we derive a class TrailerVeh from Vehicle, and
Truck from Auto and from TrailerVeh. All ambiguities concerning the member
functions are then resolved in the class Truck:
class TrailerVeh: public Vehicle
{
public:
TrailerVeh (int wt);
};
TrailerVeh::TrailerVeh (int wt)
: Vehicle (wt)
{
}
class Truck: public Auto, public TrailerVeh
{
public:
// constructors
Truck ();
Truck (int engine_wt, int sp, char const *nm,
int trailer_wt);
// interface: to set two weight fields
void setweight (int engine_wt, int trailer_wt);
// and to return combined weight
int getweight () const;
};
// example of constructor
Truck::Truck (int engine_wt, int sp, char const *nm,
int trailer_wt)
: Auto (engine_wt, sp, nm), TrailerVeh (trailer_wt)
{
}
// example of interface function
int Truck::getweight () const
{
return
( // sum of:
Auto::getweight () + // engine part plus
TrailerVeh::getweight () // the trailer
);
}