Introduction to Programming in C
Brief Summary Notes by Eric Foxley
You will be given duplicated notes each week as part of the programming course. You should supplement these by reading your own books, or books in the library.
There are many languages for programming computers; you may perhaps have used one already, at school or on a home computer. Which one have you used at school? BASIC? Pascal? None?
Old-fashioned serial languages include
There are now also many fourth generation languages or 4GLs, aimed at retrieving information from modern databases.
These languages are all essentially the same; they just involve different syntactic sugar, i.e. they are written using different formats, punctuation and grammars. The differences exist because the languages are aimed at different classes of users. For example those aimed at commercial users (COBOL, PL/I) will use more words and fewer symbols; we might see
rate_per_hour multiplied by hours_worked gives gross_pay
in COBOL, compared with
pay = rate * hours
in a language for engineers.
Languages aimed at scientific numerical users (FORTRAN, Pascal) offer facilities for simplifying the handling of vectors and matrices, and for performing very accurate arithmetic.
In addition to being aimed at different categories of user, languages may have other different objectives, such as aiming particularly at beginners (BASIC, Pascal), or being particularly safe for large projects (ADA, Modula2).
All of these languages are a formal notation for you to give sequential instructions to the computer. The instructions in all of the above languages are sequential ("Do this, then do that ...") and explicit.
There are other very different types of languages such as Prolog (taught at Nottingham to Computer Science 2nd years) which are of a completely different nature. In Prolog for example you essentially describe your problem, and leave the computer to decide how best to solve it.
A typical BASIC or APL program is interpreted; that is, each line is decoded and interpreted by the computer each time it is executed. An instruction which occurs inside a repeated loop may have to be interpreted many times. This wastes computer time, and causes the program to run relatively slowly. It has the questionable advantage that parts of the program which are not executed do not need to be interpreted, so that an incorrectly typed line may not be detected until it comes to be executed.
Programs in most of the other languages mentioned above are compiled; that is, the whole program is first analysed and digested by a compiler, which converts it into a machine executable form. This machine executable form runs much faster than an interpreted program, since all of the analysis of the program statements has been completed before any execution starts. If you request it, the compiler will spend additional time making the compiled program as efficient as possible.
The compilation is often in two distinct stages, first compiling your program into an object module, and then loading the object module into an executable program. At the loading stage, items from a library to perform certain standard operations (to handle networks, or to draw pictures, for example) may be combined with the program object module when it is loaded.
Language interpreters usually include some simple form of editor, such as the line numbering system in BASIC. Every line of the program starts with a number; the number determines the choice of a line to change, and the sequence in which lines are executed. Compiled languages use programs which have been stored in ordinary text files. You can use any editor to create the original text source file; the most commonly available general UNIX text editor is vi, and the most commonly used editor on SUN computers is emacs. In the Nottingham University Mathematics Department the preferred editor is the local extension by Dr Walker of the ed editor. Your environment variable EDITOR should be set to refer to your preferred editor.
Ken Thompson, Dennis Ritchie and their colleagues at Bell Laboratories (as they developed the UNIX computer operating system) required a language for writing a computer system. At that time (about 1970) most systems programs were written in assembly language for reasons of efficiency. This involves expressing the problem in terms of very low-level machine operations on a particular type of computer. The resulting program is long, tedious to write, error prone, non-portable and difficult to change. They realised that the use of Assembler was to be avoided at all costs.
A language called BCPL had been developed in the UK specifically for writing computer systems. Ken Thompson's first attempt at a language was based on BCPL, adapted to suit their needs, and was called "B".
After some more experience, they decided that a new language was needed. A new language was therefore developed by Dennis Ritchie, and was called "C". This was used to write the next version of UNIX system software, which eventually became the world's first portable operating system.
C has now become a widely used professional language for various reasons.
Its main drawback is that it has poor error detection.
The standard for C programs was originally the set of features described by Brian Kernighan and Dennis Ritchie in their book. In order to make the language more widely acceptable, a formal standard was developed, ANSI C (American National Standards Institute), which was subsequently adopted as an international (ISO) standard.
Another group developed C to reflect modern developments in program design, in particular object-oriented programming. This language became "C++". C++ may be considered in several ways.
It is high time that we stopped talking about programming languages, and saw an actual C program; the simplest possible program might be as follows.
/* Sample minimal program */
/* from EF's C notes */
#include <stdio.h>    /* declares printf */
#include <stdlib.h>   /* declares exit */

int main()
{
    printf( "Hi there!\n" );
    exit( 0 );
}   /* end of program */
The text of the program as shown here would be stored in a file. Try the very first exercise to see this program.
The stages of developing your C program are as follows.
Create a file containing the complete program, such as the above example, using any ordinary editor with which you are familiar such as vi, emacs or ed.
The filename must by convention end ".c" (full stop, lower case c), e.g. "myprog.c" or "progtest.c". The contents must be as in the above example, starting with the line
/* Sample ....
or a blank line preceding it, and ending with the line
} /* end of program */
or a blank line following it.
Compile your program with the command
cc program.c
where program.c is the name of the file.
If there are obvious errors in your program (such as typing
main((
instead of
main()
or misspelling one of the key words or omitting a semi-colon), the compiler will detect and report them. The compiler will tell you the number of the line where it detected the error; this may not be the line on which the error occurs, it is often the following line! Usually only the first reported error is significant; later error messages may be spuriously generated by the first genuine error. If errors occur, you must correct them using the editor, and then call the compiler again, and repeat this process until no errors are reported.
The compiler's error messages contain the word Error. There may be other messages from the compiler containing the word Warning. These represent constructions in your program which the compiler thinks suspicious. You do not have to correct them, but you should make sure that they are not significant.
There may, of course, still be logical errors that the compiler cannot detect. You may be telling the computer to do the wrong operations.
When the compiler has successfully digested your program, the compiled version, or executable, is left in a file called "a.out". After a successful compilation, execute the command
ls -l
to see that a file "a.out" exists and has execute permission. Observe its size, and compare it with that of the original program source.
The next stage is to actually run your executable program. To run an executable in UNIX, you simply type the name of the file containing it, in this case
a.out
or perhaps
./a.out
This executes your program, printing any results to the screen. At this stage there may be run-time errors, such as division by zero, or it may become evident that the program has produced incorrect output. If so, you must return to edit your program source, and recompile it, and run it again. For any serious program, the testing process must be thorough, and will involve careful planning of the number of tests required, and the test data chosen to exercise each part of the program.
Once you have an executable program in the file "a.out" , you can use "a.out" as just another UNIX command.
Any input requested by the program must be typed at the keyboard; there is no built-in prompt as in some other languages; the program just halts until you have typed the required input, terminated by end-of-line.
To take your input data from a file called input_file , use the symbol "<" in the shell and type for example
a.out < input_file
Any program output normally comes to the screen; to send output to a file output_file use the symbol ">" in the shell, and type for example
a.out > output_file
This overwrites any information which was previously in the file output_file with the new output. To append the new output to any information already in the file without overwriting, use the ">>" operation as in
a.out >> output_file
To pipe output into another program use for example
a.out | wc
to see how many lines, words and characters are in the program output.
It may be more convenient to use a "-o" and filename in the compilation as in
cc -o program program.c
which puts the compiled program into the file "program" (or any file you name following the "-o" argument) instead of putting it in the file "a.out" .
You can then run it using the command
program
Alternatively you could have used the earlier compilation command, and then executed
mv a.out program
to rename the "a.out" file.
On some machines, instead of
cc prog.c
you can use
make prog
to compile the program in prog.c. This leaves the executable program in a file prog instead of in a.out. You then type
prog
to run it. For full use of the make command see elsewhere.
You can now keep several different compiled programs. Beware though, they all occupy disc space, and you have an upper limit on your total disc occupation. It is best to delete executables that you do not require; they can always be quickly regenerated by compilation if you need them.
A compiler is itself just a program, albeit a large and complex one. Bear in mind that there may be minor differences between our compiler and ones on, for example, Computing Centre machines, or on your own PC.
There will be at least one programming exercise every week, which you may complete at any time you like within the set time limit. Use the "course summary" csum to keep yourself informed of exercises and submission dates. All of these exercises will be assessed, and the results form the basis of your end-of-course mark.
You will be expected to use the Ceilidh system for reading the coursework definition each week, and for marking your results. The intervening operations of editing, compilation and test runs can be performed either inside or outside the Ceilidh system.
The simplest option is, at least early in the course, to perform all your programming work inside Ceilidh. In Ceilidh, each course is divided into a number of units. Each week you will be asked to complete certain exercises in certain units. Inside the Ceilidh system, you perform all operations by choosing items from menus, and will typically work in the following sequence.
To work outside the Ceilidh system, first use Ceilidh to read the coursework definition ("vq") and to set up your skeleton program ("set"). Then leave the system ("q"), and use vi or "emacs" to edit, a command such as "cc -o prog11 prog11.c" to compile, and "prog11" to run your program, directly in your UNIX shell as required. Then return to Ceilidh to issue your marking ("mk") command to mark and submit the work. At some stage of the course, you may be asked to show your working program to a member of staff.
You must ALWAYS at some stage use the mark and submission command mk in Ceilidh to show that you have completed the work. If you forget this, the teacher will have no evidence that you have done the work. Do NOT hand sheets of paper to anyone; all work is monitored on-line.
The favourite book for keenies is that by Brian Kernighan and Dennis Ritchie [ref 1], but make sure you get the second edition. It assumes that you have programmed in other languages. For beginners, there are other books which take the subject a little more gently, such as those by Barclay [ref 2], Hanly [ref 3], Tizzard [ref 4], Masters [ref 5], Kelley and Pohl [ref 6] [ref 7], Hutchison and Just [ref 8] or Waite et al [ref 9]. The two books published by the Que Corporation [ref 10] [ref 11] are expensive, but very thorough, and full of worked examples.
[ref 1] Brian W Kernighan, Dennis M Ritchie, The C Programming Language 2nd Ed, Prentice-Hall (1988)
[ref 2] Kenneth A Barclay, C Problem Solving and Programming, Prentice-Hall (1989)
[ref 3] J R Hanly, E B Koffman, F L Friedman, Problem Solving and Program Design in C, Addison-Wesley (1992)
[ref 4] Keith Tizzard, C for Professional Programmers, Ellis Horwood (1986)
[ref 5] David Masters, C An Introduction with Advanced Applications, Prentice-Hall (1991)
[ref 6] [ref 7] Al Kelley, Ira Pohl, C by Dissection, Benjamin Cummings (1992)
[ref 8] Robert C Hutchison, Steven B Just, Programming Using the C Language, McGraw-Hill (1988)
[ref 9] Mitchell Waite, Stephen Prata, Donald Martin, C Primer Plus, Howard W Sams (1984)
[ref 10] Jack J Purdum, C Programming Guide 3rd Ed, Que Corporation
[ref 11] Sobelman, Krekelberg, Advanced C, Que Corporation
We will assume that you are familiar with UNIX.
Choose any book with which you feel happy. The notes supplied with this course should be fairly comprehensive.
You may need to contact the course teacher at some point. Eric Foxley works jointly between the Computer Science and Mathematics departments; he may therefore be difficult to locate! He may be available in his Computer Science office (Tower building floor 11 room 1102, internal phone 4210) part of the time, or in his Mathematics department office (Maths/Physics building top floor room C107, internal phone 4953). To find which office he is in at any given time, you will of course use
rwho | grep ef
to see if ef is logged on, to which machine, and from where. It may be more convenient to use electronic mail to "ef" or the co (comment) facility in Ceilidh.
Details of this week's coursework are available from the "course summary" csum facility in Ceilidh.
This section is for future information.
The C preprocessor is a separate program (usually in a file such as /lib/cpp), which is quite useful in its own right. The compiler passes your program through this preprocessor before compiling it. The facilities of the preprocessor are detailed below.
#include "file.h"    /* include the named source file here */
#include "sub.c"     /* ".c" for C source */
                     /* ".h" for header file */
#include <stdio.h>   /* include from /usr/include/stdio.h */
The form used in the third example <...> searches a standard system area for the file. The other "..." form searches first the current directory, then the include directories.
Convention is that ".h" files (header files) include no code generation; see later for the significance of this.
Include directories can also be specified at the command level (so that different versions can be compiled with different included files).
Examples are
cc -o prog -I/usr/lib/include prog.c
cc -o prog.exe -I/usr/vms/include prog.c
Both fixed and parameterised macros are available.
/* a fixed definition */
#define VAT 0.15
#define MAX 100
/* every future occurrence of "VAT" is substituted by "0.15" */
tax = price * VAT;   /* compiler sees "tax = price * 0.15;" */
if( n > MAX ) ...    /* compiler sees "n > 100" */
Beware of
#define COST base + part   /* COST will be replaced by "base + part" */
price = COST * number;     /* compiler sees "base + part * number" */
Use
#define COST (base + part) /* to avoid operator problems */
/* a parameterised definition */
#define MAX( X, Y ) ( X > Y ? X : Y )
p = MAX( x / y, y / x );
/* the compiler sees "p = ( x / y > y / x ? x / y : y / x );" */
The same problems arise as mentioned above. A better definition is
#define MAX( X, Y ) ( (X)>(Y) ? (X) : (Y) )
Note the brackets in the following.
#define PRINT( X ) printf X ;
The statements
PRINT( ( "hello" ) )
PRINT( ( "%d", x ) )
are now expanded to
printf ( "hello" ) ;
printf ( "%d", x ) ;
If we redefine PRINT as empty by
#define PRINT( X )
the statements expand to nothing, i.e. they vanish.
Somewhere near the start we may have
#define M68040
(a macro name must be an identifier, so it cannot begin with a digit). Later on we may now use
#ifdef M68040
...   /* this code included only if M68040 is defined */
#else
...   /* this included otherwise */
#endif
#ifndef M68040
...   /* this code included only if M68040 NOT defined */
#endif
To cancel a definition
#undef M68040   /* to unset the definition */
For two versions of a program (a standard version and a master version with extra facilities) use
#ifdef MASTER
...
#endif
Programs wishing to be portable should set standard defines such as "SUN3", "UNIX", "VAX" etc., usually one for the processor, one for the system. Since #ifdef tests only a single name, several tests are combined using #if and defined, e.g.
#if defined(SUN3) && defined(UNIX)
Defines of both the above types are available at the command parameter level as follows:
cc -DVAT=0.15 prog.c
cc -DM68040 prog.c
These are not important at the moment, but you may find them useful later.
cc *.c | text in several files |
cc -o progex prog.c | named executable file; executable goes into "progex" |
cc -O ... | capital letter O, optimise; takes longer to compile but produces more efficient program |
cc -S ... | leave assembler code in .s |
cc -s ... | strip relocation data |
cc -c ... | compile only, to .o; then use the "ld" command |
cc ... -lm | look in library m |
Examples of complete commands
cc -o prog -O -s prog.c -lm
cc -c *.c         # compile several to .o files
ld *.o            # link and load them to a.out
cc prog.c *.o     # compile prog.c, link with *.o
Useful UNIX commands related to C programming include
lint prog.c | comment on C source |
cc -p prog.c | run-time profiling |
xref -c prog.c | cross ref'ce listing |
make | system maintenance |
SCCS | Source code control system |
RCS | SUN Revision Control System |
adb | debugging |
diff | textual differences |
grep | searching for text patterns |
cb | C program beautifier |
indent | ditto |
Copyright Eric Foxley 1996
Notes converted from troff to HTML by an Eric Foxley shell script, email errors to me!