There are already ways in which you can use files for input and output without using any new programming features. For example, you can output to a file using
program1 > file1
to redirect (all of) the program's standard output into the file file1 and you can read those results from the file using
program2 < file1
However, this precludes any user interaction with the program, which might involve displaying questions or a menu, and reading user replies. Further, it does not allow for multiple files to be used.
There are two levels for using files in C; one uses the basic UNIX system calls, and one the stdio (standard input/output) library. We will describe the stdio system first, which is the simplest system for straightforward use.
All file input/output works on the open a file, access it, close it principle.
You need a line
#include <stdio.h>
at the top of your program.
This is the simplest general file input/output system. File descriptors are of type FILE *, where FILE is a type defined in stdio.h (usually by a typedef rather than a #define). Don't forget the "*".
To open a file, use the function fopen as in:
#include <stdio.h>

FILE *fd;

fd = fopen( "filename", "w" );
/* first parameter is the file name,
 * as a string or "char *";
 * second parameter is the mode:
 * "r" = read, "w" = write,
 * "a" = append */
The call of fopen checks that you have permission to access the file in the mode that you have requested. If the fopen fails, it delivers a NULL pointer. You should therefore follow the fopen statement by code such as
if ( fd == NULL ) {
    ... /* error, open failed */ ...
    exit( 1 );
}
Always check the returned value for NULL before proceeding. The value of NULL is #defined in the header file stdio.h.
When opening a file for reading, the file must exist and you must have read access to it.
When opening a file for writing, if the file exists it is emptied, and you must have write access. If it does not exist, it is created; you must have write access to the directory in which it is to be created.
To write to an opened file (opened for writing, file descriptor "fd"), use code such as:
fprintf( fd, "Value of i is %d\n", i );
This is just like printf, but with an extra file descriptor parameter at the start. It returns the number of characters written, or a negative value if there was an error (e.g. filesystem full).
You might choose to test the returned value from fprintf, since a print is more likely to fail when performed into a file. Use
if ( fprintf( fd, "Value of i is %d\n", i ) < 0 ) { .... error .... }
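The open, check, write, check, close sequence can be collected into one small function. This is only a minimal sketch: the name write_value and its return convention are made up for illustration, not part of stdio.

```c
#include <stdio.h>

/* Hypothetical helper: write one integer to the named file.
 * Returns 0 on success, -1 if the open, the write or the close failed. */
int write_value(const char *filename, int i)
{
    FILE *fd = fopen(filename, "w");   /* "w" empties or creates the file */

    if (fd == NULL)
        return -1;                     /* open failed */
    if (fprintf(fd, "Value of i is %d\n", i) < 0) {   /* negative means error */
        fclose(fd);
        return -1;
    }
    /* fclose flushes buffered output, so it can also report a write error */
    if (fclose(fd) == EOF)
        return -1;
    return 0;
}
```

A caller would then write something like if ( write_value( "results", i ) != 0 ) and report the failure itself.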
To read from an opened file (opened for reading, file descriptor "fd"), use code such as:
fscanf( fd, "%d", &i );
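Note that fscanf returns an integer value as for scanf: the number of items successfully read, or EOF (a value #defined in stdio.h) if end of file or an error was met before anything could be read.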
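As an illustration of testing the value returned by fscanf, here is a minimal sketch which adds up all the integers in a file. The function name sum_integers and its interface are assumptions made for this example.

```c
#include <stdio.h>

/* Hypothetical helper: add up the whitespace-separated integers in a file.
 * Returns the number of integers read, or -1 if the file cannot be opened. */
int sum_integers(const char *filename, long *total)
{
    FILE *fd = fopen(filename, "r");
    int i, count = 0;

    if (fd == NULL)
        return -1;
    *total = 0;
    /* fscanf returns the number of items converted: 1 here on success,
     * 0 if the next input is not a number, EOF at end of file or on error.
     * The loop therefore stops at the first thing that is not an integer. */
    while (fscanf(fd, "%d", &i) == 1) {
        *total += i;
        count++;
    }
    fclose(fd);
    return count;
}
```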
When you have finished with the file, you must close it. Do this using
fclose( fd );
Any files you forget to fclose will be closed for you when the program terminates. However, you cannot have more than a certain number of open files at any time, so it is good practice to close each file when you have finished accessing it.
To read from one file, and write amended values to another, use the outline
FILE *input, *output;

input = fopen( ..., "r" );
output = fopen( ..., "w" );
while( fscanf( input, .... ) != EOF ) {
    ....
    fprintf( output, .... );
}
fclose( input );
fclose( output );
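A filled-in version of this outline might look as follows. It is a sketch under assumptions: the files hold integers, the "amendment" is simply doubling each value, and the name double_values is invented for the example.

```c
#include <stdio.h>

/* Hypothetical example: copy integers from one file to another, doubling each.
 * Returns the number of values copied, or -1 on any error. */
int double_values(const char *inname, const char *outname)
{
    FILE *input, *output;
    int value, count = 0;

    input = fopen(inname, "r");
    if (input == NULL)
        return -1;
    output = fopen(outname, "w");
    if (output == NULL) {
        fclose(input);
        return -1;
    }
    /* Testing for 1 (one item converted) also stops the loop on malformed
     * input, where a test against EOF alone would loop for ever. */
    while (fscanf(input, "%d", &value) == 1) {
        if (fprintf(output, "%d\n", 2 * value) < 0) {
            fclose(input);
            fclose(output);
            return -1;
        }
        count++;
    }
    fclose(input);
    if (fclose(output) == EOF)
        return -1;
    return count;
}
```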
The three standard channels are available as the streams stdin for reading keyboard input, stdout for standard output, and stderr for error messages (which are not diverted when standard output is redirected with > in the shell). You do not need to open these three streams. Error messages should be sent to the stderr stream.
if ( ( fd = fopen( file, "r" ) ) == NULL ) {
    fprintf( stderr, "Cannot open file %s\n", file );
    exit( 1 );
}
It is often useful to append new information to existing data in a file. If you open a file for writing, its contents are lost completely.
To append to a file, use
fd = fopen( "filename", "a" );
If the named file does not already exist, it is created; the call fails if you do not have the necessary write permission. Any data already in the file remains. Any further fprintf calls which you perform will be appended to the end of the file.
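A typical use of append mode is a log file which grows by one line per event. The sketch below assumes a plain text log; the names append_line and count_lines are invented for the example.

```c
#include <stdio.h>

/* Hypothetical helper: add one line of text to the end of a file.
 * Returns 0 on success, -1 on failure. */
int append_line(const char *filename, const char *text)
{
    FILE *fd = fopen(filename, "a");   /* existing contents are kept */

    if (fd == NULL)
        return -1;
    if (fprintf(fd, "%s\n", text) < 0) {
        fclose(fd);
        return -1;
    }
    return (fclose(fd) == EOF) ? -1 : 0;
}

/* Hypothetical helper: count the lines in a file, -1 if it cannot be opened. */
int count_lines(const char *filename)
{
    FILE *fd = fopen(filename, "r");
    int ch, lines = 0;

    if (fd == NULL)
        return -1;
    while ((ch = getc(fd)) != EOF)
        if (ch == '\n')
            lines++;
    fclose(fd);
    return lines;
}
```

Calling append_line twice on the same file leaves both lines in it, in the order written.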
The append mode is not fundamental to UNIX; it is a library function.
In the stdio library, you can open processes to read from and to write to. I find this a very useful extension. For input, we could have
FILE *pipout, *pipin;

pipin = popen( "who", "r" );
/* we can now read from output of
 * "who" command using fscanf */
For output, we could have
pipout = popen( "lpr", "w" ); /* output using fprintf goes straight to lpr */
or even
pipout = popen( "sort | pr | lpr -Panadex", "w" ); /* output goes to sort, then pr etc */
We could now copy from input (reading the output of the who command) to output (sort and print process) using the following.
char ch;

while ( fscanf( pipin, "%c", &ch ) > 0 ) {
    fprintf( pipout, "%c", ch );
}
At the end, you MUST
pclose( pipin );
pclose( pipout );
These pclose calls cannot be ignored like those of fclose, since it is essential that we send an EOF to the output stream, and that we wait for that process to terminate.
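Reading from a process can be wrapped up as in the following sketch, which counts the lines a command prints. The function name count_output_lines is an assumption; popen and pclose themselves are POSIX rather than standard C, which is why the feature-test macro is defined first.

```c
#define _POSIX_C_SOURCE 200809L   /* ask for the POSIX declarations of popen/pclose */
#include <stdio.h>

/* Hypothetical helper: run a shell command and count the lines it prints.
 * Returns the line count, or -1 if the command could not be started. */
int count_output_lines(const char *command)
{
    FILE *pipin = popen(command, "r");
    int ch, lines = 0;

    if (pipin == NULL)
        return -1;
    while ((ch = getc(pipin)) != EOF)
        if (ch == '\n')
            lines++;
    /* pclose waits for the command to terminate; -1 means the wait failed */
    if (pclose(pipin) == -1)
        return -1;
    return lines;
}
```

For example, count_output_lines( "who" ) would give the number of lines that the who command prints.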
You can move the read/write pointer around within a file using the fseek routine.
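long offset;
int status;

status = fseek( fd, offset, 0 );
The first parameter is an opened file descriptor. The second is a byte offset. The returned value is zero if the move succeeded and non-zero if it failed; call ftell( fd ) afterwards to obtain the absolute position in the file. The last (third) parameter is 0 to measure the offset from the start of the file, 1 to measure it from the current position, and 2 to measure it from the end of the file.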
Thus to move back to the start of the file (perhaps to read it again without closing and opening it) use:
fseek( fd, 0L, 0 );
To skip to the end of the file (perhaps to append additional information, or to find its size) use:
long int size;

fseek( fd, 0L, 2 );
size = ftell( fd );   /* fseek itself returns only success or failure */
If the file is composed of (fixed length) records of a particular type of struct, to skip to the start of record number i (counting the first record as number 0) use
fseek( fd, i * sizeof ???, 0 );
fscanf( fd, ... ); /* to read it */
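The following sketch shows record access on a file of fixed-length records. It makes several assumptions which are not in the notes above: the struct stock layout is invented, the records are stored in binary and read with fread rather than fscanf, and the symbolic names SEEK_SET and SEEK_END from stdio.h are used in place of the numbers 0 and 2.

```c
#include <stdio.h>

/* Hypothetical fixed-length record; the fields are illustrative only. */
struct stock {
    int  number;
    long quantity;
};

/* Read record number i (counting from zero) from a file of struct stock.
 * Returns 0 on success, -1 if the seek or the read failed. */
int read_record(FILE *fd, long i, struct stock *rec)
{
    if (fseek(fd, i * (long) sizeof *rec, SEEK_SET) != 0)
        return -1;
    return (fread(rec, sizeof *rec, 1, fd) == 1) ? 0 : -1;
}

/* Return the number of whole records in the file, or -1 on error. */
long record_count(FILE *fd)
{
    long size;

    if (fseek(fd, 0L, SEEK_END) != 0)   /* fseek returns 0 on success */
        return -1;
    size = ftell(fd);                   /* ftell reports the position */
    if (size < 0)
        return -1;
    return size / (long) sizeof(struct stock);
}
```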
There are many other stdio functions; I can't remember them all! See the on-line manual (man stdio, man fprintf etc.) for details.
Updating files is the essence of commercial programming. A file will contain details of
In commerce, each set of related data (one person's record, the data for one type of stock item in the warehouse) is referred to as a "record". Each complete set of related data and the means for accessing it from within a program would be represented by a structure inside a C++ program.
Each day or week or month (or instantly on receipt of an interactive transaction) the file will be updated, and a new file produced. For security reasons, a firm will keep a limited number of old copies of the file together with details of all subsequent transactions, so that the latest file can be re-created if it gets corrupted.
The information inside most files will be held in a definite order, e.g. ordered by personnel works numbers, warehouse stock number, bank customer account number, flight departure time, etc.
A typical program to update a file would, if the file is small,
If all has gone smoothly, the new file is now the master copy, the previous master file becomes the backup copy.
The program sequence might be
Then interact with the user using:
while ( Ask "Any more updates? ", Reply isn't no ) {
    Ask "person? ", read person
    Find array subscript for this person
    Ask "details? ", read details
    Amend entry values in array of structures
} /* end while more updates */
Now finish off with
For larger files, it may not be possible to read the whole file into memory. The program would first order the transactions so that they are in the same order as the entries in the master file; we would assume that the transactions are now held in a file rather than input from a keyboard. We then read the existing master file one entry at a time, see if that entry needs updating, and write that entry to the new master file. In this case both old and new files (and the file of transactions) are open, and only one record is held in the program at a time. The program outline might be as follows.
while ( Not at end of transaction file ) {
    Read next transaction from transaction file
    Read records from master file, copying to new master file
        until this person's record found
    Check transaction details
    Amend record values
    Write this person's new record to the new master file
}
Alternatively, the while loop could be controlled by the reading from the input file, as in
while ( Not reached end-of-input-file ) {
    Read record from existing file
    if ( not person we're looking for ) {
        Write record to output file
        continue
    }
    Ask "details? ", read details
    Amend record values
    Write this person's record to output file
    Ask "Next person to search for? "
} /* end loop to end of file */
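The sequential update of a large file can be sketched in C as below, with the loop controlled by reading the old master file. Everything specific here is an assumption for illustration: each master record is a line "id balance", each transaction is a line "id amount", both files are sorted by ascending id, and the name update_master is invented.

```c
#include <stdio.h>

/* Hypothetical sequential update: apply sorted transactions to a sorted
 * master file, writing a new master file.
 * Returns the number of transactions applied, or -1 on error. */
int update_master(const char *oldname, const char *transname,
                  const char *newname)
{
    FILE *oldf, *trans, *newf;
    int id, tid, have_trans, applied = 0;
    long balance, amount;

    oldf = fopen(oldname, "r");
    trans = fopen(transname, "r");
    newf = fopen(newname, "w");
    if (oldf == NULL || trans == NULL || newf == NULL) {
        if (oldf != NULL) fclose(oldf);
        if (trans != NULL) fclose(trans);
        if (newf != NULL) fclose(newf);
        return -1;
    }

    have_trans = (fscanf(trans, "%d %ld", &tid, &amount) == 2);
    /* Only one master record and one transaction are held at a time */
    while (fscanf(oldf, "%d %ld", &id, &balance) == 2) {
        /* Skip transactions for ids which are not in the master file */
        while (have_trans && tid < id)
            have_trans = (fscanf(trans, "%d %ld", &tid, &amount) == 2);
        /* Apply every transaction for this record */
        while (have_trans && tid == id) {
            balance += amount;
            applied++;
            have_trans = (fscanf(trans, "%d %ld", &tid, &amount) == 2);
        }
        fprintf(newf, "%d %ld\n", id, balance);
    }
    fclose(oldf);
    fclose(trans);
    if (fclose(newf) == EOF)
        return -1;
    return applied;
}
```

Because the old master file is only read, it survives untouched as the backup copy if anything goes wrong.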
For interactive transactions (such as airline bookings) there must be a way of locking an individual record; it must not be possible for two customers to simultaneously request a spare seat, find that there is one, and both attempt to occupy the same single remaining seat! We are then into a new level of complexity.
We now look at the lower level facilities for file handling, the basic system-provided input-output. These are provided by system calls to the UNIX kernel. File descriptors are now integers.
To open a file
int fds;                     /* file descriptor */
char *file = "/tmp/eric";    /* the filename */

fds = open( file, 0 );       /* 0 = read, 1 = write, 2 = both */
if ( fds < 0 ) exit( 1 );    /* -1 returned if error */
For reading, the file must already exist and have read access.
For writing, the file must already exist and have write access. Note that in the standard file i/o system earlier, if the file did not exist, it was created; that is not the case here.
The integer returned must be retained for future use.
The integer returned is the lowest available channel. To create a file,
fds = creat( file, 0755 );
The second parameter is the file access mode, usually written in octal. If the file already exists with write access, this call will empty it and position you at its start; it will not change the permissions.
If the file doesn't exist, it is created (you need write permission to the directory) with mode 0755 octal.
If it exists but does not have write access, or if you do not have write access to the directory, the function fails, and returns the value -1.
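If a file is created with access mode 0 (no access permissions), this program can still write to it through the descriptor returned by creat; that descriptor is open for writing only, so to read the data back as well the file must be created by a call which opens it for both. No other program (unless run by root) will be able to open it. This is good for temporary work files.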
int n; char buffer[100];
n = read( fds, buffer, 100 );
The integer fds is the file descriptor (integer) returned from the open, the second parameter is a pointer (of type "char *") to where the data is to go, the third parameter is the number of bytes requested.
The returned integer (stored in n in this example) is the number of bytes actually read. This may be fewer than the number requested; it is 0 at end of file and -1 if there was an error.
To read a character at a time, use
char ch;

while ( read( fds, &ch, 1 ) == 1 ) {
    ....
}
This loop terminates at error, or at end of file.
To (binary) read a structure,
read( fds, &object, sizeof object );
The write call parallels the read call.
n = write( fds, buffer, 100 );
To copy one file to another
while( ( n = read( fdsin, buffer, 100) ) > 0 ) {
    write( fdsout, buffer, n );
}
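A complete copy function built on this loop might look like the sketch below. The name copy_file, the buffer size and the mode 0644 are assumptions; the symbolic flags from fcntl.h are used in place of the numbers 0, 1 and 2, and the output file is created with open and O_CREAT | O_TRUNC, which has the same effect as creat.

```c
#include <fcntl.h>     /* open and the O_ flags */
#include <unistd.h>    /* read, write, close */

/* Hypothetical helper: copy one file to another using system calls only.
 * Returns the number of bytes copied, or -1 on error. */
long copy_file(const char *from, const char *to)
{
    char buffer[4096];
    long total = 0;
    ssize_t n;
    int fdsin, fdsout;

    fdsin = open(from, O_RDONLY);
    if (fdsin < 0)
        return -1;
    /* create the file if necessary, empty it, open it for writing */
    fdsout = open(to, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fdsout < 0) {
        close(fdsin);
        return -1;
    }
    while ((n = read(fdsin, buffer, sizeof buffer)) > 0) {
        ssize_t done = 0;
        /* write may accept fewer bytes than asked, so loop until all are out */
        while (done < n) {
            ssize_t w = write(fdsout, buffer + done, (size_t) (n - done));
            if (w < 0) {
                close(fdsin);
                close(fdsout);
                return -1;
            }
            done += w;
        }
        total += n;
    }
    close(fdsin);
    if (close(fdsout) < 0 || n < 0)   /* n < 0 means the last read failed */
        return -1;
    return total;
}
```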
When you have finished with a data stream
close( fds );
This returns -1 if error. All files are closed on program termination anyway.
The lseek routine lets you move around a file.
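long posn, offset;

posn = lseek( fds, offset, 0 );
The returned value is the absolute position in the file, or -1 if there was an error. The last parameter is 0 to measure the offset from the start of the file, 1 to measure it from the current position, and 2 to measure it from the end of the file.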
Thus to move back to the start of the file
lseek( fds, 0L, 0 );
To skip to the end of the file
lseek( fds, 0L, 2 );
If the file is composed of (fixed length) records, use
lseek( fds, i * sizeof ???, 0 );
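Combining lseek with read and write gives direct access to any record. In this sketch the struct person layout, the function names and the mode 0600 are all assumptions, and SEEK_SET from unistd.h stands for the value 0.

```c
#include <fcntl.h>
#include <sys/types.h>   /* off_t */
#include <unistd.h>

/* Hypothetical fixed-length record; the fields are illustrative only. */
struct person {
    int  works_number;
    char name[20];
};

/* Open (creating if necessary) a record file for reading and writing. */
int open_people(const char *path)
{
    return open(path, O_RDWR | O_CREAT, 0600);
}

/* Write record number i (counting from zero). Returns 0, or -1 on error. */
int put_person(int fds, long i, const struct person *p)
{
    if (lseek(fds, (off_t) i * (off_t) sizeof *p, SEEK_SET) == (off_t) -1)
        return -1;
    return (write(fds, p, sizeof *p) == (ssize_t) sizeof *p) ? 0 : -1;
}

/* Read record number i (counting from zero). Returns 0, or -1 on error
 * (including an attempt to read beyond the end of the file). */
int get_person(int fds, long i, struct person *p)
{
    if (lseek(fds, (off_t) i * (off_t) sizeof *p, SEEK_SET) == (off_t) -1)
        return -1;
    return (read(fds, p, sizeof *p) == (ssize_t) sizeof *p) ? 0 : -1;
}
```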
These are UNIX system calls.
unlink( "/tmp/effile" );
You can unlink a temporary file immediately after creating it with creat so that it is (almost) invisible.
The system call
link( "oldname", "newname" );
is the same as the shell command
ln oldname newname
For workfiles invisible to the outside world, use
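/* creat alone gives a write-only descriptor, so open the file
 * for both reading and writing (flags from <fcntl.h>) */
fds = open( "/tmp/junk", O_RDWR | O_CREAT | O_TRUNC, 0 );
unlink( "/tmp/junk" );
write( fds ... );
....
lseek( fds, .... );
....
read( fds, ... );
close( fds );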
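The work file technique can be packaged as below. This is a sketch under assumptions: the name open_workfile is invented, the file is opened with open and O_RDWR so that it can be read back as well as written, O_EXCL is added so that an existing file of the same name is not clobbered, and the mode 0600 is used rather than 0.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: open a scratch file which has no name in the
 * file system. Returns a descriptor open for reading and writing,
 * or -1 on error. */
int open_workfile(const char *path)
{
    /* O_EXCL makes the call fail if the name already exists */
    int fds = open(path, O_RDWR | O_CREAT | O_EXCL, 0600);

    if (fds < 0)
        return -1;
    unlink(path);   /* the data stays reachable through fds until close */
    return fds;
}
```

The space used by the file is released automatically when the descriptor is closed or the program terminates.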
Some programs use the presence or absence of a lock-file to indicate the availability or otherwise of a resource (e.g. a direct link to another machine).
Problem : minimise time between
char *lock = "/tmp/vaxlock";

fds = creat( lock, 0 );
if ( fds < 0 ) {
    ... exists ...
    ... wait and loop, or exit ...
} else {
    /* didn't exist, but does now */
    ... do work ...
    unlink( lock );
}
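There are problems with root: the super-user is not subject to the permission checks, so for root the creat succeeds even when the lock file already exists with mode 0. Use link instead of creat, since link fails for every user if the new name already exists.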
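A lock built on link might be sketched as follows. The function names, the scheme of a private temporary name made from the process id, and the mode 0600 are all assumptions made for this example.

```c
#include <fcntl.h>
#include <stdio.h>     /* snprintf */
#include <unistd.h>    /* getpid, link, unlink, close */

/* Hypothetical helper: try to take the lock.
 * Returns 0 if we now hold it, -1 if it is already held (or on error). */
int acquire_lock(const char *lock)
{
    char temp[256];
    int fds, result;

    /* A private file name, made unique by the process id */
    if (snprintf(temp, sizeof temp, "%s.%ld", lock, (long) getpid())
            >= (int) sizeof temp)
        return -1;
    fds = open(temp, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fds < 0)
        return -1;
    close(fds);
    /* link fails if the lock name already exists, whoever we are */
    result = link(temp, lock);
    unlink(temp);
    return (result == 0) ? 0 : -1;
}

/* Hypothetical helper: give the lock up again. */
void release_lock(const char *lock)
{
    unlink(lock);
}
```

The test for existence and the creation of the lock are a single system call, so there is no gap in which another process can slip in.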
Three channels are already open.
write( 1, "message:\n", 9 );
write( 2, "error", 5 );
Diverting standard input to an unopened file.
close( 0 );
fds = open( "new_file", 0 );   /* returns zero */
getchar() ...                  /* from file */
scanf( ... ) ...               /* from file */
Diverting input to an already opened file:
A call of dup gives a second file descriptor pointing to an existing stream. It is rather like a file link.
fds = open( "file", 0 );
...
close( 0 );
fds1 = dup( fds );   /* result is zero */
close( fds );
...
while( ( ch = getchar() ) != EOF ) {   /* ch must be an int */
    ....   /* continues from current place in file */
}
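The dup technique can be wrapped in a function as in the sketch below. The names divert_stdin and count_stdin_chars are assumptions, and the code relies on nothing having been read from standard input before the diversion (otherwise stdio may already hold buffered input from the old source).

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: make standard input read from the named file.
 * Returns 0 on success, -1 on failure. */
int divert_stdin(const char *path)
{
    int fds, fds1;

    fds = open(path, O_RDONLY);
    if (fds < 0)
        return -1;
    if (fds == 0)          /* channel 0 was free, so it is already in place */
        return 0;
    close(0);
    fds1 = dup(fds);       /* dup returns the lowest free descriptor: zero */
    close(fds);
    return (fds1 == 0) ? 0 : -1;
}

/* Hypothetical helper: count the characters available on standard input. */
long count_stdin_chars(void)
{
    long count = 0;
    int ch;                /* int, not char, so that EOF can be recognised */

    while ((ch = getchar()) != EOF)
        count++;
    return count;
}
```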
Copyright Eric Foxley 1996
Notes converted from troff to HTML by an Eric Foxley shell script, email errors to me!