Chapter 8 : File input/output

It is comparatively easy to perform simple serial input and output using files in C++. We will teach only the straightforward aspects.

8.1. Introduction

Perhaps we should first remember that input and output can be redirected from and to files in operating systems such as UNIX or MS-DOS without any program changes, provided that we are redirecting all of the input or all of the output. Thus we can write

    prog32 < data_file

to run the program prog32 taking all of its input from the file data_file. This is the basis of the way your programs are tested for dynamic correctness in the Ceilidh command against various sets of test data.

Conversely, we can follow the program name with ">" and a file name to store all of the (standard) output in the named file. This is how your program output is saved when Ceilidh tests the program. The dynamic testing system then searches the saved output for the keywords or phrases which the teacher has specified.

The error output (anything using the cerr stream) would still come to the terminal, and would not be diverted to the file.

This simple approach for diverting program input or output will satisfy some simple situations where we wish to use files instead of the terminal for input or output. However, for most realistic applications, we will need to communicate with standard input and output (the user's terminal) as well as one or more files. We will generally need to interact with a user (print a prompt such as "When does the person arrive at Nottingham station?" on the terminal, and read the reply from the terminal) as well as interacting with the files of data (reading from a file containing the actual train timetables, and perhaps writing to a file of seat reservations).

8.2. The basics of file access

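The approach to file input/output in most modern shared computer systems involves the program (i.e. any process wishing to access a file) in the following sequence of activities.

  • open the file, naming it and stating whether it is to be read or written
  • check that the open was successful
  • read from or write to the file
  • close the file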

Under UNIX and MS-DOS, files are made to appear very similar to any other form of serial input/output device. Thus in the ordinary command shell, output can be directed to the terminal, or to a file, or to another process, with no fundamental change to the process producing the output. The operating system is hiding from the user what are in reality great differences between the physical processes of sending characters to a terminal and sending them to a file.

A typical C++ program to send output to a file is

The declaration of an ofstream requests an "output file stream", to be called fout. The call of fout.open then asks that it be connected to the file "/tmp/data_out_file". If the request is successful, then "fout" will appear to us just like "cout" for performing output, but output will go to the file, not the terminal. The particular identifier chosen ("fout" in this case) is entirely up to the programmer. Before proceeding any further with the program, we must check that the named file was opened successfully; this is done with the "if ( fout.fail() )" part of the code. There is no point in the program continuing if the open was not successful. At this point we would typically print a message to cerr and exit with non-zero status.

Having opened our output file stream successfully, we then write data and strings to the file using exactly the same techniques as we have been using for cout, but using the output stream identifier fout instead.

We do not need to close the stream when we have finished with the file, but can leave closing to be performed automatically during the end-of-program activities. It is however generally good practice to close files no longer in use. The status of the close should also be checked with a call of fout.fail().

Reading from a file

If we had wanted to read from the file, we might have written

We start by declaring an ifstream (input file stream) and using "fin.open( f )" to connect it to file f. We again use fin.fail() to check that the open was successful. If the request is successful, then fin becomes an input stream to be treated just like cin. In this case the file is presumed to contain a series of integer values terminated by a zero. It could have been created using a text editor such as "vi" or "emacs", or as the standard output of an earlier program redirected to the file with ">", or by a program using it as an output file stream directly.

The choice of identifier "fin" is up to you.

Simultaneously open files

We may need several files to be open at the same time, typically at least one for reading and one for writing. There will usually be an upper limit imposed by the system, perhaps 20 files.

In UNIX the file close takes place automatically when the program terminates if the file has not previously been closed. In other systems, information being written to a file may be lost if the file is not closed properly.

8.4. The use of files in general

Updating files is the essence of commercial programming. A file will contain details of

  • all personnel, pay to date, tax to date, tax codes, etc
  • all stock in the warehouse, current and minimum levels, etc
  • all bank accounts, the owner, the balance, the maximum debt, etc
  • all flights by the airline, booked and free seats, destination, timing, etc

In commerce, each set of related data (one person's record, the data for one type of stock item in the warehouse) is referred to as a "record". Each complete set of related data and the means for accessing it from within a program would be represented by a structure inside a C++ program.

Each day or week or month (or instantly on receipt of an interactive transaction) the file will be updated, and a new file produced. For security reasons, a firm will keep a limited number of old copies of the file together with details of all subsequent transactions, so that the latest file can be re-created if it gets corrupted.

The information inside most files will be held in a definite order, e.g. ordered by personnel works numbers, warehouse stock number, bank customer account number, flight departure time, etc.

Updating small files

A typical program to update a file would, if the file is small,

  • 1 read the whole of the latest master file into an array (open for reading, read the entire file, and then close it)
  • 2 interact with the user using "cin" and "cout" (or use information stored in a data file) to update the various entries in the array: to add this week's pay, to decrement or increment the current warehouse stock values, to change the current credit in the bank accounts, or to reserve a seat on a flight
  • 3 write the updated information stored in the array into a new file (open for writing, write the entire file, and close).

If all has gone smoothly, the new file is now the master copy, and the previous master file becomes the backup copy.

The program sequence therefore has three parts: first read the master file into the array, then interact with the user, and finish off by writing the array to the new file.

Updating large files

For larger files, it may not be possible to read the whole file into memory. The program would first order the transactions so that they are in the same order as the entries in the master file; we would assume that the transactions are now held in a file rather than input from a keyboard. We then read the existing master file one entry at a time, see if that entry needs updating, and write that entry to the new master file. In this case both old and new files (and the file of transactions) are open, and only one record is held in the program at a time. The program outline might be as follows.

Alternatively, the while loop could be controlled by the read from the input file itself, as in

Interactive transactions

For interactive transactions (such as airline bookings) there must be a way of locking an individual record; it must not be possible for two customers to simultaneously request a spare seat, find that there is one, and both attempt to occupy the same single remaining seat! We are then into a new level of complexity.

Copyright Eric Foxley Tue Dec 3 09:33:09 GMT 1996

