Chapter 8 : File Input and Output

8.1. Introduction

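There are already ways in which you can use files for input and output without using any new programming features. For example, you can output to a file using a shell command of the form prog > file1 to redirect (all of) the program's standard output into the file file1, and you can read those results from the file using a command of the form prog2 < file1 (prog and prog2 stand for whatever programs you are running).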

However, this precludes any user interaction with the program, which might involve displaying questions or a menu, and reading user replies. Further, it does not allow for multiple files to be used.

There are two levels for using files in C: one uses the basic UNIX system calls, and one the stdio (standard input/output) library. We will describe the stdio system first, since it is the simplest system for straightforward use.

8.2. The C standard file i/o system

All file input/output works on the "open a file, access it, close it" principle.

8.2.1. Opening a file

You need the line #include <stdio.h> at the top of your program.

This is the simplest general file input/output system. File descriptors are of type FILE *, where FILE is a type which is defined in the header file stdio.h. Don't forget the "*".

To open a file, use the function fopen. It takes the file name and a mode string ("r" for reading, "w" for writing) and returns a value of type FILE *.

The call of fopen checks that you have permission to access the file in the mode that you have requested. If the fopen fails, it delivers a NULL pointer.

Always check the returned value for NULL before proceeding. The value of NULL is #defined in the header file stdio.h.

When opening a file for reading, the file must exist and you must have read access to it.

When opening a file for writing, if the file exists it is emptied, and you must have write access. If it does not exist, it is created; you must have write access to the directory in which it is to be created.
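
The check on the value returned by fopen can be sketched as follows. This is only an illustration: open_checked is a hypothetical helper name, and reporting the failure with perror is one choice among several.

```c
#include <stdio.h>

/* Open a file with fopen and report any failure on stderr.
   Returns the stream, or NULL if the file could not be opened.
   open_checked is a hypothetical helper, not part of stdio. */
FILE *open_checked(const char *name, const char *mode)
{
    FILE *fd = fopen(name, mode);

    if (fd == NULL) {
        /* perror prints the name followed by the system's reason */
        perror(name);
    }
    return fd;
}
```

A caller would write something like fd = open_checked("file1", "r") and stop, or ask for another name, if the result is NULL.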

To write to a file which has been opened for writing, use fprintf. This is just like printf, but with an extra file descriptor parameter at the start. It returns the number of characters written, or a negative value if there was an error (e.g. filesystem full).

You should check the value returned by fprintf, since a print is more likely to fail when performed into a file than onto the terminal.
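
A minimal sketch of testing the value returned by fprintf is given here. The helper name write_value and the output layout are assumptions made for the example.

```c
#include <stdio.h>

/* Write one labelled value to an open stream.
   Returns 0 on success, -1 if fprintf reported an error. */
int write_value(FILE *fd, const char *label, int value)
{
    /* fprintf delivers a negative value when the output fails */
    if (fprintf(fd, "%s = %d\n", label, value) < 0) {
        fprintf(stderr, "write failed\n");
        return -1;
    }
    return 0;
}
```

Because stdio buffers its output, some failures may only show up later, when the buffer is flushed or the file is closed, so the result of fclose is worth checking as well.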

8.2.2. Reading from a file

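To read from an opened file (opened for reading, file descriptor "fd"), use fscanf, which is just like scanf but with an extra file descriptor parameter at the start.

Note that fscanf returns an integer value as for scanf (the number of items successfully read), or EOF (a value #defined in stdio.h) if the end of the file or an error is met before any item has been read.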
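
As an illustration of using the value returned by fscanf to control a loop, the following sketch adds up all the integers in a file. The function name sum_integers is hypothetical.

```c
#include <stdio.h>

/* Read whitespace-separated integers from fd until end of file, or
   until something that is not an integer is met.
   Stores the total in *sum and returns how many values were read. */
int sum_integers(FILE *fd, long *sum)
{
    int value;
    int count = 0;

    *sum = 0;
    /* fscanf returns the number of items converted: 1 here on success,
       0 on a mismatch, EOF at end of file */
    while (fscanf(fd, "%d", &value) == 1) {
        *sum += value;
        count++;
    }
    return count;
}
```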

8.2.3. Closing the file

When you have finished with the file, you must close it. Do this using fclose, passing it the file descriptor.

Any files you forget to fclose will be closed for you when the program terminates. However, you cannot have more than a certain number of open files at any time, so it is good practice to close each file when you have finished accessing it.

To read from one file and write amended values to another, open both files, read and write inside a single loop, and close both files at the end.
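
One possible outline is sketched below. The amendment (adding a fixed bonus to every integer) and the name copy_amended are invented for the example; a real program would substitute its own record layout and change.

```c
#include <stdio.h>

/* Copy integers from the file in_name to the file out_name, adding
   bonus to each one. Returns the number of values copied, or -1 if
   either file could not be opened or a write failed. */
int copy_amended(const char *in_name, const char *out_name, int bonus)
{
    FILE *in, *out;
    int value, count = 0;

    if ((in = fopen(in_name, "r")) == NULL)
        return -1;
    if ((out = fopen(out_name, "w")) == NULL) {
        fclose(in);
        return -1;
    }
    while (fscanf(in, "%d", &value) == 1) {
        if (fprintf(out, "%d\n", value + bonus) < 0) {
            count = -1;
            break;
        }
        count++;
    }
    fclose(in);
    /* fclose flushes buffered output, so its result matters here */
    if (fclose(out) == EOF)
        count = -1;
    return count;
}
```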

8.2.4. Standard input and output

The three standard channels are now stdin for reading keyboard input, stdout for standard output, and stderr for error messages (which are not diverted when standard output is redirected in the shell). You do not need to open these three streams. Error messages should be sent to the stderr stream.

8.2.5. Appending to a file

It is often useful to append new information to existing data in a file. If you open a file for writing, its contents are lost completely.

To append to a file, use fopen with the mode "a".

This will fail if you do not have write permission to the named file; if the file does not already exist, it is created. Any data already in the file remains. Any further fprintf calls which you perform will be appended to the end of the file.
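
A short sketch of appending one line to a file follows. The helper name append_line is hypothetical.

```c
#include <stdio.h>

/* Add one line of text to the end of the named file.
   Returns 0 on success, -1 on failure. */
int append_line(const char *name, const char *line)
{
    FILE *fd = fopen(name, "a");   /* "a": keep old data, write at end */

    if (fd == NULL)
        return -1;
    if (fprintf(fd, "%s\n", line) < 0) {
        fclose(fd);
        return -1;
    }
    return fclose(fd) == EOF ? -1 : 0;
}
```

Calling it twice on the same file leaves both lines in the file, in the order in which they were written.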

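The append mode was not originally fundamental to UNIX; it was provided by the library. Current systems also offer it in the kernel, through the O_APPEND flag of the open system call.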

8.2.6. Pipes in STDIO

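In the stdio library, you can open processes for reading from and writing to, using popen in place of fopen. I find this a very useful extension. For input, we could open the who command for reading.

For output, we could open a printing process for writing, or even a pipeline which sorts its input and then prints it.

We could now copy from input (reading the output of the who command) to output (sort and print process) by reading characters from the one stream and writing them to the other.

At the end, you MUST pclose both streams.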

These pclose calls cannot be ignored like those of fclose, since it is essential that we send an EOF to the output stream, and that we wait for that process to terminate.
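
The copy between two processes can be sketched as below, assuming a POSIX system on which popen and pclose are available. The function name pipe_copy is hypothetical, and the two commands are passed in as parameters rather than fixed.

```c
#define _POSIX_C_SOURCE 200809L   /* for popen and pclose */
#include <stdio.h>

/* Run in_cmd and out_cmd as child processes and copy everything the
   first prints into the standard input of the second.
   Returns the number of characters copied, or -1 on failure. */
long pipe_copy(const char *in_cmd, const char *out_cmd)
{
    FILE *in, *out;
    int c;
    long copied = 0;

    if ((in = popen(in_cmd, "r")) == NULL)
        return -1;
    if ((out = popen(out_cmd, "w")) == NULL) {
        pclose(in);
        return -1;
    }
    while ((c = getc(in)) != EOF) {
        putc(c, out);
        copied++;
    }
    /* pclose sends end of file to the output process and waits for
       each process to terminate */
    if (pclose(in) == -1)
        copied = -1;
    if (pclose(out) == -1)
        copied = -1;
    return copied;
}
```

For example, pipe_copy("who", "sort") would pass the output of who through sort; the printing command to add to the pipeline depends on the local system.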

8.2.7. Moving around inside a file

You can move the read/write pointer around within a file using the fseek routine.

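The first parameter is an opened file descriptor. The second is a byte offset. The returned value is zero if the move succeeded and non-zero if it failed; the function ftell delivers the absolute value of the current position in the file. The last (third) parameter says where the offset is measured from: SEEK_SET (0) for the start of the file, SEEK_CUR (1) for the current position, and SEEK_END (2) for the end of the file.

Thus to move back to the start of the file (perhaps to read it again without closing and opening it), use an offset of 0 with SEEK_SET.

To skip to the end of the file (perhaps to append additional information, or to find its size), use an offset of 0 with SEEK_END.

If the file is composed of (fixed length) records of a particular type of struct, to skip to the start of record number i (counting the first record as number 0), use an offset of i times the size of the struct, with SEEK_SET.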
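
The following sketch shows both uses of fseek. The struct record layout and the helper names are assumptions for the example, and the record is read with the standard fread routine, which these notes do not otherwise describe.

```c
#include <stdio.h>

/* Hypothetical fixed-length record, used only for illustration. */
struct record {
    int  id;
    long balance;
};

/* Return the size of an open file in bytes, or -1 on error.
   The read/write pointer is left at the end of the file. */
long file_size(FILE *fd)
{
    if (fseek(fd, 0L, SEEK_END) != 0)
        return -1;
    return ftell(fd);
}

/* Read record number i (counting from 0) into *out.
   Returns 0 on success, -1 if the seek or the read failed. */
int read_record(FILE *fd, long i, struct record *out)
{
    if (fseek(fd, i * (long)sizeof(struct record), SEEK_SET) != 0)
        return -1;
    return fread(out, sizeof(struct record), 1, fd) == 1 ? 0 : -1;
}
```

Asking for a record beyond the end of the file is reported by the read rather than by the seek, since fseek will happily move past the end.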

8.2.8. Other stdio routines

There are many; I can't remember them all! See the on-line manual (man stdio, man fprintf, etc.) for details.

8.3. Updating files

Updating files is the essence of commercial programming. A file will contain details of

  • all personnel, pay to date, tax to date, tax codes, etc
  • all stock in the warehouse, current and minimum levels, etc
  • all bank accounts, the owner, the balance, the maximum debt, etc
  • all flights by the airline, booked and free seats, destination, timing, etc

    In commerce, each set of related data (one person's record, the data for one type of stock item in the warehouse) is referred to as a "record". Each complete set of related data and the means for accessing it from within a program would be represented by a structure inside a C program.

    Each day or week or month (or instantly on receipt of an interactive transaction) the file will be updated, and a new file produced. For security reasons, a firm will keep a limited number of old copies of the file together with details of all subsequent transactions, so that the latest file can be re-created if it gets corrupted.

    The information inside most files will be held in a definite order, e.g. ordered by personnel works numbers, warehouse stock number, bank customer account number, flight departure time, etc.

    8.3.1. Updating small files

    A typical program to update a file would, if the file is small,

  1. read the whole of the latest master file into an array
  2. interact with the user (or use information stored in a data file) to update the various entries in the array (to add this week's pay, to decrement or increment the current warehouse stock values, to change the current credit in the bank accounts, to reserve a seat on a flight)
  3. write the updated information stored in the array into a new file.

    If all has gone smoothly, the new file is now the master copy, the previous master file becomes the backup copy.

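    The program sequence might be to open the master file, read records into the array until the end of the file is reached, and close the file.

    Then interact with the user, changing the entries in the array as each transaction is entered.

    Now finish off by opening the new master file, writing every entry of the array to it, and closing it.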
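
The three steps can be sketched as three functions. Everything specific here is an assumption made for illustration: the struct item layout, the text format of one "number level" pair per line, the limit MAX_ITEMS, and the function names. The interaction with the user is reduced to a single call which applies one change.

```c
#include <stdio.h>

#define MAX_ITEMS 100

/* Hypothetical stock record: one line "number level" per item. */
struct item {
    int number;
    int level;
};

/* Step 1: read the whole master file into the array. Returns the
   number of records read, or -1 if the file cannot be opened. */
int load_master(const char *name, struct item stock[], int max)
{
    FILE *fd = fopen(name, "r");
    int n = 0;

    if (fd == NULL)
        return -1;
    while (n < max &&
           fscanf(fd, "%d %d", &stock[n].number, &stock[n].level) == 2)
        n++;
    fclose(fd);
    return n;
}

/* Step 2: apply one transaction. Returns 0, or -1 for an unknown item. */
int adjust_level(struct item stock[], int n, int number, int change)
{
    int i;

    for (i = 0; i < n; i++) {
        if (stock[i].number == number) {
            stock[i].level += change;
            return 0;
        }
    }
    return -1;
}

/* Step 3: write the array to a new master file. Returns 0 or -1. */
int save_master(const char *name, const struct item stock[], int n)
{
    FILE *fd = fopen(name, "w");
    int i;

    if (fd == NULL)
        return -1;
    for (i = 0; i < n; i++) {
        if (fprintf(fd, "%d %d\n", stock[i].number, stock[i].level) < 0) {
            fclose(fd);
            return -1;
        }
    }
    return fclose(fd) == EOF ? -1 : 0;
}
```

Writing to a new file rather than over the old one means the previous master file survives untouched as the backup copy.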

    8.3.2. Updating large files

    For larger files, it may not be possible to read the whole file into memory. The program would first order the transactions so that they are in the same order as the entries in the master file; we would assume that the transactions are now held in a file rather than input from a keyboard. We then read the existing master file one entry at a time, see if that entry needs updating, and write that entry to the new master file. In this case both old and new files (and the file of transactions) are open, and only one record is held in the program at a time. The program outline might be as follows.

    Alternatively, the while loop could be controlled by the reading from the input file, so that it ends when the read of the next master record fails to deliver a complete record.
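
A sketch of such an update, with the loop controlled by the read from the old master file, is given below. The text layouts ("account balance" for master records and "account amount" for transactions) and the name update_master are assumptions; both files are taken to be already sorted by account number.

```c
#include <stdio.h>

/* Merge a sorted transaction file into a sorted master file, holding
   only one master record in memory at a time.
   Returns the number of master records written, or -1 on write error. */
int update_master(FILE *old_master, FILE *trans, FILE *new_master)
{
    int  acct, t_acct;
    long balance, amount;
    int  have_trans, written = 0;

    have_trans = (fscanf(trans, "%d %ld", &t_acct, &amount) == 2);

    /* The loop is controlled by the read from the old master file */
    while (fscanf(old_master, "%d %ld", &acct, &balance) == 2) {
        /* Skip transactions for accounts not in the master file */
        while (have_trans && t_acct < acct)
            have_trans = (fscanf(trans, "%d %ld", &t_acct, &amount) == 2);
        /* Apply every transaction for this account */
        while (have_trans && t_acct == acct) {
            balance += amount;
            have_trans = (fscanf(trans, "%d %ld", &t_acct, &amount) == 2);
        }
        if (fprintf(new_master, "%d %ld\n", acct, balance) < 0)
            return -1;
        written++;
    }
    return written;
}
```

A production version would probably report the skipped transactions instead of discarding them silently.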

    8.3.3. Interactive transactions

    For interactive transactions (such as airline bookings) there must be a way of locking an individual record; it must not be possible for two customers to simultaneously request a spare seat, find that there is one, and both attempt to occupy the same single remaining seat! We are then into a new level of complexity.

    8.4. Basic UNIX file I/O system calls

    We now look at the lower level facilities for file handling. These are provided by system calls to the UNIX kernel.

    8.4.1. Opening a file

    We now look at the basic system-provided input-output. File descriptors are now integers.

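    To open a file, use the open call, giving the file name and the access required: O_RDONLY (0) for reading, O_WRONLY (1) for writing, or O_RDWR (2) for both.

    For reading, the file must already exist and have read access.

    For writing, the file must already exist and have write access. Note that in the standard file i/o system earlier, if the file did not exist, it was created; that is not the case here.

    The integer returned must be retained for future use.

    The integer returned is the lowest available channel, or -1 if the open fails. To create a file, use the creat call, giving the file name and an access mode.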

    The second parameter is the file access mode, usually written in octal. If the file already exists with write access, this call will empty it, will not change the permissions, and point to its start.

    If the file doesn't exist, it is created (you need write permission to the directory) with the mode requested, for example 0755 octal, less any permissions removed by the umask.

    If it exists but does not have write access, or if it does not exist and you do not have write access to the directory, the function fails, and returns the value -1.

    If a file is created with access mode 0 (no access permissions), this program can still write to it through the file descriptor returned. No other program run by an ordinary user will be able to open it. This is good for temporary work files.
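
The open and creat calls can be sketched as follows. The helper names and the mode 0644 are choices made for the example, and the write call used to put something into the new file is the one described later in these notes.

```c
#include <fcntl.h>
#include <unistd.h>

/* Create (or empty) a file and write a short message into it.
   Returns 0 on success, -1 on failure. */
int create_with_message(const char *name, const char *msg, size_t len)
{
    int fds = creat(name, 0644);    /* mode in octal: rw-r--r-- */

    if (fds == -1)
        return -1;
    if (write(fds, msg, len) != (ssize_t)len) {
        close(fds);
        return -1;
    }
    return close(fds);
}

/* Open an existing file for reading. Returns the descriptor, or -1
   if the file does not exist or cannot be read. */
int open_for_reading(const char *name)
{
    return open(name, O_RDONLY);
}
```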

    8.4.2. Reading from the file

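    The read call takes three parameters. The integer fds is the file descriptor (integer) returned from the open, the second parameter is a pointer (of type "char *") to where the data is to go, the third parameter is the number of bytes requested.

    The returned integer is the number of bytes actually read: 0 at end of file, or -1 on error.

    To read a character at a time, call read in a loop requesting a single byte, and continue for as long as it returns 1.

    This loop terminates at error, or at end of file.

    To (binary) read a structure, pass the address of the structure as the second parameter and its size (from sizeof) as the third.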
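
Both kinds of read can be sketched as below. The struct point type and the function names are hypothetical.

```c
#include <unistd.h>

/* Hypothetical record type for the binary read. */
struct point {
    int x;
    int y;
};

/* Count the bytes remaining in fds by reading one character at a time.
   The loop ends at end of file (read returns 0) or on error (-1). */
long count_bytes(int fds)
{
    char c;
    long total = 0;

    while (read(fds, &c, 1) == 1)
        total++;
    return total;
}

/* Binary read of one structure. Returns 0 on success, -1 if a whole
   structure could not be read. */
int read_point(int fds, struct point *p)
{
    ssize_t n = read(fds, (char *)p, sizeof *p);

    return n == (ssize_t)sizeof *p ? 0 : -1;
}
```

Reading one byte per system call is slow for large files; asking for a block at a time is the usual remedy.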

    8.4.3. Writing

    The write call parallels the read call.

    To copy one file to another, read a block at a time from the input and write each block to the output, until read returns 0.
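
A sketch of such a copy is given here. The name copy_file, the buffer size of 4096 bytes and the mode 0644 are arbitrary choices for the example.

```c
#include <fcntl.h>
#include <unistd.h>

/* Copy the file src to dst, creating or emptying dst.
   Returns the number of bytes copied, or -1 on any error. */
long copy_file(const char *src, const char *dst)
{
    char    buffer[4096];
    ssize_t n;
    long    total = 0;
    int     in, out;

    if ((in = open(src, O_RDONLY)) == -1)
        return -1;
    if ((out = creat(dst, 0644)) == -1) {
        close(in);
        return -1;
    }
    while ((n = read(in, buffer, sizeof buffer)) > 0) {
        /* write may accept fewer bytes than asked for; this sketch
           simply treats that as an error */
        if (write(out, buffer, (size_t)n) != n) {
            total = -1;
            break;
        }
        total += n;
    }
    if (n == -1)
        total = -1;
    close(in);
    if (close(out) == -1)
        total = -1;
    return total;
}
```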

    8.4.4. Close

    When you have finished with a data stream, close it using the close call, passing it the file descriptor.

    This returns -1 if error. All files are closed on program termination anyway.

    8.4.5. Random Access to a File

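    The lseek routine lets you move around a file. Its parameters are the file descriptor, a byte offset, and a value saying where the offset is measured from.

    The returned value is the absolute position in the file, or -1 on error. The last parameter is SEEK_SET (0) for the start of the file, SEEK_CUR (1) for the current position, or SEEK_END (2) for the end of the file.

    Thus to move back to the start of the file, use an offset of 0 with SEEK_SET.

    To skip to the end of the file, use an offset of 0 with SEEK_END; the value returned is then the size of the file.

    If the file is composed of (fixed length) records, use an offset of the record number (counting from 0) times the record size, with SEEK_SET.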
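
These uses of lseek can be sketched as follows. The struct record layout and the three function names are assumptions for the example; open_records is included only so that the sketch has a file to work on.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical fixed-length record, used only for illustration. */
struct record {
    int  id;
    long balance;
};

/* Open (creating or emptying) a record file for reading and writing. */
int open_records(const char *name)
{
    return open(name, O_RDWR | O_CREAT | O_TRUNC, 0644);
}

/* Size of the file in bytes: lseek returns the new absolute position. */
off_t fd_size(int fds)
{
    return lseek(fds, 0, SEEK_END);
}

/* Read record number i (counting from 0) into *out.
   Returns 0 on success, -1 if the seek or the read failed. */
int fetch_record(int fds, long i, struct record *out)
{
    if (lseek(fds, (off_t)i * (off_t)sizeof *out, SEEK_SET) == (off_t)-1)
        return -1;
    return read(fds, out, sizeof *out) == (ssize_t)sizeof *out ? 0 : -1;
}
```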

    8.4.6. Removing a File

    A file is removed with the unlink call, giving the file name; it returns -1 on error. This is a UNIX system call.

    You can unlink a temporary file immediately after creating it with creat so that it is (almost) invisible.

    8.4.7. Creating a link

    The system call link, given the name of an existing file and a new name for it, is the same as the shell command ln given the same two names.

    8.4.8. Invisible Temporary Files

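    For workfiles invisible to the outside world, create the file and unlink it immediately, keeping the file descriptor open for as long as the file is needed.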
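
A sketch of such a work file follows. It departs from the plain creat call in one respect, which is an assumption of this example: the file is made with open and the flags O_RDWR, O_CREAT and O_EXCL, so that the descriptor can be used for reading back as well as writing. The name open_workfile is hypothetical.

```c
#include <fcntl.h>
#include <unistd.h>

/* Create a work file and unlink it at once. The name vanishes from
   the directory, but the data remains usable through the descriptor
   until it is closed. Returns the descriptor, or -1 on failure. */
int open_workfile(const char *name)
{
    /* O_EXCL makes the call fail rather than reuse an existing file */
    int fds = open(name, O_RDWR | O_CREAT | O_EXCL, 0600);

    if (fds == -1)
        return -1;
    if (unlink(name) == -1) {
        close(fds);
        return -1;
    }
    return fds;
}
```

The disc space is released automatically when the descriptor is closed or the program terminates, so nothing is left behind after a crash.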

    8.4.9. Lock files

    Some programs use the presence or absence of a lock-file to indicate the availability or otherwise of a resource (e.g. a direct link to another machine).

    Problem : minimise time between

  • a) Does the file exist?
  • b) If not, create it.

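    There are problems with root: a lock file created without write access does not stop root, who is not subject to permission checks. Use link instead of creat, since link fails for every user if the new name already exists.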
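
One way of using link for a lock file is sketched below. The scheme of first creating a private file named after the process id, and the function names, are assumptions of this example and not the only way to do it.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Try to take the lock named lock_name. Returns 0 if this process now
   holds the lock, -1 if it is already held or an error occurred. */
int acquire_lock(const char *lock_name)
{
    char temp_name[256];
    int  fds, result;

    /* A private file whose name includes the process id */
    snprintf(temp_name, sizeof temp_name, "%s.%ld",
             lock_name, (long)getpid());
    if ((fds = creat(temp_name, 0644)) == -1)
        return -1;
    close(fds);

    /* link fails if lock_name already exists, even for root, and the
       test and the creation happen as one step */
    result = link(temp_name, lock_name);
    unlink(temp_name);
    return result;
}

/* Give the lock up again by removing the lock file. */
int release_lock(const char *lock_name)
{
    return unlink(lock_name);
}
```

Because the existence test and the creation are a single system call, there is no interval between steps a) and b) in which another process can slip in.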

    8.4.10. Default input/output channels

    Three channels are already open: 0 (standard input), 1 (standard output) and 2 (standard error).

    8.4.11. Diverting Standard Input

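    To divert standard input to an unopened file, close channel 0 and then open the file. Since open returns the lowest available channel, the file becomes channel 0.

    To divert input to an already opened file, close channel 0 and then call dup on the open file descriptor.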

    A call of dup gives a second file descriptor pointing to an existing stream. It is rather like a file link.
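
The two techniques can be combined in one sketch. Here the file is opened first, so that standard input is left alone if the open fails, and dup is then used to make channel 0 refer to it. The function name divert_stdin_to_file is hypothetical.

```c
#include <fcntl.h>
#include <unistd.h>

/* Divert standard input (channel 0) to the named file.
   Returns 0 on success, -1 on failure. */
int divert_stdin_to_file(const char *name)
{
    int fds = open(name, O_RDONLY);

    if (fds == -1)
        return -1;          /* standard input is left untouched */
    if (fds == 0)
        return 0;           /* channel 0 was free, so the file took it */
    close(0);
    /* dup returns the lowest free channel, which is now 0 */
    if (dup(fds) != 0) {
        close(fds);
        return -1;
    }
    close(fds);             /* channel 0 still refers to the file */
    return 0;
}
```

After a successful call, any read from channel 0 takes its data from the file.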

    Copyright Eric Foxley 1996


    Notes converted from troff to HTML by an Eric Foxley shell script, email errors to me!