DOC HOME SITE MAP MAN PAGES GNU INFO SEARCH PRINT BOOK
 
The Design of C++ Standard Components

Techniques for speeding execution

This section describes a few of the implementation techniques we used for speeding execution. In terms of complexity theory, these are all techniques for reducing the constant multiplier mentioned in the last section.

Some of these techniques are well known, and some are not-so-well known.

Using inline functions

A well-known technique for speeding execution is to make short functions inline. A not-so-well-known technique is to partition complicated functions into two parts: a simple case, which is done inline, and a complicated case, which requires a function call. We have made an effort do do this wherever possible. Consider the reserve() function, used to guarantee the validity of a Block index in the Block(3C++) component:

       Block.h
           template <class T> class Block {
               int n;
               unsigned grow(unsigned);
               ...
           public:
               int reserve(unsigned k) {
                   return k<n || grow(k);
               }
               ...
           };

Whenever k is strictly less than n (which is probably most of the time), reserve() is fast since the call to grow() is short-circuited. Only when the test fails is the call necessary. This is very important to client programs, which are encouraged to call reserve() prior to every assignment:

       myprog.c
           #include <Block.h>
           main() {
               int i = 0;
               Block<int> b;
               while (some unpredictable condition ) {
                   b.reserve(i);
                   b[i++] = 1;
               }
           }

Using copy-avoidance techniques

Another well-known technique for speeding execution is to avoid writing functions that return large data structures:

       struct Big {
           lots of stuff
       };
       Big foo() {
           Big b;
           modify b        return b;
       }

When such a function is used to initialize a Big variable, an expensive copy takes place:

       Big b = foo();   // expensive copy

Although it is possible for C++ translators to recognize this situation and optimize away the copy (this is called ``return value optimization''), not all C++ translators do so. The most common copy-avoidance technique is to return the result through a reference argument:

       void foo(Big& b) {
           modify b        return;
       }

This has two problems. First, there is the cost (borne by the client) of constructing an empty Big object, only to have it immediately overwritten by the first call to foo(). Second, the solution leads to unnatural syntax for functions most naturally expressed as operators:

       Instead of this (which makes a copy):
           Big operator+(const Big& b1,const Big& b2);
       We would have to do this (to avoid copy):
           void plus(const Big& b1,const Big& b2, Big& result);

Instead, we use a different technique which allows functions to return Big values without copying them. The technique exploits the fact that most C++ translators will do the ``return value optimization'' if foo() has the following form:

       Big foo() {
           return constructor;}

To implement the Big addition operator, for example, we define a private constructor that does the work of addition. Then the addition operator (which must be a friend in order to invoke the private constructor) merely invokes the constructor in a return statement:

       class Big {
           friend Big operator+(const Big& b1,
               const Big& b2);
       private:
           Big(const Big& b1,const Big& b2);
       public:
           ...
       };
   

Big operator+(const Big& b1,const Big& b2) { return Big(b1,b2); }

Typically, there must be one such private constructor for each function that returns a Big value. The following trick makes sure that each of these constructors has a unique signature:

       class Big_plus;
       class Big_minus;
       class Big {
           friend Big operator+(const Big& b1,
               const Big& b2);
           friend Big operator-(const Big& b1,
               const Big& b2);
       private:
           Big(const Big& b1,const Big& b2,Big_plus* p);
           Big(const Big& b1,const Big& b2,Big_minus* p);
       public:
           ...
       };
       Big operator+(const Big& b1,const Big& b2) {
           return Big(b1,b2,(Big_plus*)0);
       }
       Big operator-(const Big& b1,const Big& b2) {
           return Big(b1,b2,(Big_minus*)0);
       }

Ugly? Yes. But fortunately, you don't have to worry about any of these details. You can write expressions like this

       Big b1, b2, b3;
       ...
       Big b4 = b1 + b2 - b3;

with relative assurance that b4 is computed without any copies.

Avoiding init objects

Init objects have become a standard programming technique for solving the initialization problem. This problem can be stated as follows:

   After a line of the form #include <T.h>, a client should be
   able to use any of the facilities declared in T.h.

For example, clients should be able to do this:

       client.c
           #include <String.h>
           String s = "hello, world!";

Unfortunately, guaranteeing this can be tricky for the library implementor. The problem is that the String constructor may depend on certain initializations having been performed. For example:

       String.h
           class String {
               static Pool* pool;
               ...
           public:
               String();
               ...
           };
       String.c
           #include <String.h>
           Pool String::pool = new Pool(...);
           String::String(const char* s) {
               do something with String::pool
           }

Here the constructor for String requires that String::pool already be initialized. Unfortunately, this may not be the case at the time the constructor for s in the client's program is called. It will be the case if and only if static initialization of String.c is performed prior to static initialization of client.c. But C++ does not guarantee any particular order of file initializations (it only guarantees that within the same file, static objects are initialized from top to bottom).

Init objects are one attempt at solving this problem. Here is how the init object scheme would work for the above example. In String.h, place the following prior to the declaration of class String:

       String.h
           class String_init_object {
               static int inited;
               int inited_by_me;
               String_init_object();
               ~String_init_object();
           };
           static String_init_object string_init_object;
           class String {...};

This ordering guarantees that any file which includes String.h will initialize the string_init_object before it initializes any String object; this is so because a declaration of a String object cannot precede the class definition for class String. We leave it to the String_init_object constructor to take care of the necessary initializations:

       int String_init_object::inited = 0;
       String_init_object::String_init_object() :
           inited_by_me(0) {
           if (!inited) {
               inited = 1;
               inited_by_me = 1;
               String::pool = new Pool(...);
           }
       }

and for the String_init_object destructor to do any necessary finalizations:

       String_init_object::~String_init_object() {
           if (inited_by_me) {
               delete String::pool;
           }
       }

After using the init object scheme in earlier (internal) versions of C++ Standard Components, we eventually abandoned it. The problem is that use of init objects increases the complexity, both space and time, of client programs. For every .c file which either directly or indirectly includes T.h, a String_init_object is defined, increasing the size of the executable. Furthermore, if N files include String.h, then the String_init_object constructor will be called N times - only one of which does any real work. Finally, on virtual memory systems, every page containing an init object is touched on program startup. In the extreme case, the entire program may be brought into primary memory just to perform the initialization of static objects.

Instead of init objects, we modify the class String member functions to do the initialization themselves:

       String.c
           Pool String::pool = 0;
           String::String() {
               if (pool == 0)
               pool = new Pool(...);
               ...
           }

The key to doing this correctly is to make sure the initialization is done in a place (or places) that guarantees the variable is always initialized before any possible use. For this the library implementor must do some mental dataflow analysis. Furthermore, so that they do not degrade runtime performance, the checks must be done in places where their runtime is percentage-wise insignificant. For example, in the String constructor, the time spent in testing pool == 0 is insignificant compared to the time spent in doing the rest of the work of the constructor.

More about avoiding init objects

In the above section, we explained why we do not use init objects in C++ Standard Components. It would be equally disastrous (in terms of size and execution time of client programs) for one of our library header files to #include a non-library header file containing an init object. For example, the following is no good:

       String.h
           #include <iostream.h>
           class String {...};
           inline ostream& operator<<(ostream& os,
             const String& s) {
               os << (const char*)s;
           }

Since some implementations of iostream.h have an init object, this is unacceptable. Instead, we forward declare the necessary classes:

       String.h
           class ostream;
           class String {...};
           // Note that the following function
           // is no longer inline
           ostream& operator<<(ostream& os,
               const String& s);

This also has the advantage of reducing compilation time (see the section ``Compilation time'' in this chapter). The only cost to the client is a slight runtime penalty for the out-of-line call to the insertion operator.

Not checking for client errors

This topic was covered at length in the section ``Treatment of errors'' in the chapter ``Getting Started''. In the context of the current discussion, imposing preconditions on the use of a function improves the execution speed of the function by eliminating the need for checking. For example, when the String(3C++) manpage says

       char operator[](unsigned i);
   

Returns character number i. Precondition: i must be less than the length of the String.

it means that the operation does not check the condition, and failure to satisfy it will result in unspecified (and usually disastrous) behavior. The alternative (checking the index) would impose an unacceptable runtime overhead on every reference.


Next topic: Avoiding language features with execution speed penalties
Previous topic: Algorithms and data structures

© 2004 The SCO Group, Inc. All rights reserved.
UnixWare 7 Release 7.1.4 - 27 April 2004