CSC 357 Lecture Notes Week 5
More on Files and Directories; Function Pointers in C; Standard I/O Library; System Data Files and Information



  1. Relevant reading:
    1. Stevens Chapters 4, 5, 6.
    2. K&R Section 5.11.
    3. Man pages for referenced functions.

  2. Reading directories (Stevens Section 4.21).
    1. As noted at the end of last week's notes, there is a useful example on pp. 121-125.
      1. It shows the use of the functions opendir, readdir, and closedir.
      2. It also illustrates the use of the directory information structure struct dirent.
    2. The signatures of directory-related system functions are:
      DIR* opendir(const char *filename);
      
      struct dirent *readdir(DIR* dirp);
      
      int closedir(DIR* dirp);
      

    3. The DIR type is internal, similar to FILE.
    4. struct dirent has programmer-accessible information.
      1. Its definition is implementation-dependent.
      2. On falcon/hornet, it's in <sys/dirent.h>.
    5. Here's the falcon/hornet definition of the dirent structure:
      typedef struct dirent {
         ino_t d_ino;     /* inode number of entry */
         off_t d_off;     /* offset of disk directory entry */
         short d_reclen;  /* len of this record */
         char d_name[];   /* name of file */
      } dirent_t;
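
    The three directory functions fit together as in the following minimal sketch. It is not taken from Stevens; count_entries is a hypothetical helper name, and the only field of struct dirent it relies on is d_name.

```c
#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <string.h>

/*
 * Print the names in the directory dirpath, skipping "." and "..".
 * Returns the number of names printed, or -1 if the directory cannot be
 * opened.  (count_entries is an illustrative name, not a library function.)
 */
int count_entries(const char *dirpath) {
    DIR *dirp;
    struct dirent *entry;
    int count = 0;

    assert(dirpath != NULL);

    dirp = opendir(dirpath);
    if (dirp == NULL) {
        perror(dirpath);
        return -1;
    }

    /* readdir returns NULL at the end of the directory (or on error). */
    while ((entry = readdir(dirp)) != NULL) {
        if (strcmp(entry->d_name, ".") == 0 ||
            strcmp(entry->d_name, "..") == 0) {
            continue;
        }
        printf("%s\n", entry->d_name);
        count++;
    }

    closedir(dirp);
    return count;
}
```

    A call such as count_entries(".") lists the current directory; the DIR* value is only ever passed around, never examined, just as with FILE*.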
      

  3. The chdir, fchdir, and getcwd functions (Stevens Section 4.22).
    1. They are defined in <unistd.h>.
    2. Signatures:
      int chdir(const char *pathname);
      
      int fchdir(int fildes);
      
      char* getcwd(char* buf, size_t size);
      
    3. chdir and fchdir change the current working directory of a process, i.e., a running program.
    4. getcwd returns the path of the current working directory.
    5. chdir is the system-level analog of the user-level "cd" command.
    6. getcwd is the system-level analog of the user-level "pwd" command.
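
    As a rough illustration of the "cd" and "pwd" analogy, the sketch below changes directory and then asks where it ended up. The helper name change_and_report is made up for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Change the working directory to dir, then store the resulting absolute
 * path in buf, which holds size bytes.  Returns 0 on success, -1 on failure.
 * (change_and_report is an illustrative name.)
 */
int change_and_report(const char *dir, char *buf, size_t size) {
    assert(dir != NULL && buf != NULL);

    if (chdir(dir) != 0) {              /* like "cd dir" */
        perror("chdir");
        return -1;
    }
    if (getcwd(buf, size) == NULL) {    /* like "pwd" */
        perror("getcwd");
        return -1;
    }
    return 0;
}
```

    Note that the change affects only the calling process; the shell that launched the program stays where it was.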

  4. The ftw and nftw functions.
    1. On page 121, Stevens notes these directory traversing functions.
      1. They are defined in the so-called XSI extensions of POSIX, and are available in most current UNIX systems, including falcon/hornet, Mac OS X, and Linux.
      2. The normal place for the header definitions is <ftw.h>.
      3. You may use nftw in your solution to Programming Assignment 3, if you like.
    2. Signatures:
      int ftw(const char *path,
              int (*fn) (const char*, const struct stat*, int),
              int depth);
      
      int nftw(const char *path,
               int (*fn) (const char*, const struct stat*, int, struct FTW*),
               int depth,
               int flags);
      
      1. The path argument is the root of the path that the functions recursively traverse.
      2. The fn argument is the function that is called as each element of the directory hierarchy is visited; the default visiting order is a pre-order traversal.
      3. The depth argument is an estimate of the depth of the traversed directory tree; in practice, it is the maximum number of file descriptors the functions may keep open at once.
      4. For nftw, the fourth flags argument controls aspects of function operation, including the traversal order and whether symbolic links are followed; see the man page for details.
    3. Example:
      #include <stdio.h>
      #include <ftw.h>
      
      /****
       *
       * Example use of nftw.
       *
       */
      
      /**
       * The visit function is called by nftw for each element of a traversed
       * directory hierarchy.  It prints out the path of the element being visited,
       * and its mode as an octal value.  The first three octits of the mode are the
       * file type, with 040 indicating a directory, and 100 a plain file.  The last
       * three octits are the file permissions, e.g., 644 = "rw-r--r--".
       */
      int visit(const char* path, const struct stat* stat, int flags, struct FTW* ftw) {
          printf("path=%s, mode=%o\n", path, (unsigned int) stat->st_mode);
          return 0;
      }
      
      /**
       * The main function just calls nftw, catching any error it might return.
       */
      int main(void) {
          if (nftw(".", visit, 10, 0) != 0) {
              perror("nftw");
              return 1;
          }
          return 0;
      }
      
    4. The second argument to ftw and nftw is a function pointer, a topic we need to discuss further.
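
    Here is a second, hedged sketch of nftw that uses two things the notes mention but do not show: the flags argument (FTW_PHYS, so symbolic links are not followed) and the third callback argument, which says what kind of object was visited. The names tally and count_regular_files are hypothetical, and the _XOPEN_SOURCE definition is an assumption about what the local headers need (glibc requires it; other systems may not).

```c
#define _XOPEN_SOURCE 500   /* assumption: exposes nftw on glibc; must precede the #includes */
#include <assert.h>
#include <ftw.h>
#include <stdio.h>
#include <sys/stat.h>

/* nftw passes no user data to the callback, so the tally lives in a static. */
static long regular_files;

/* Callback: typeflag reports the kind of object being visited. */
static int tally(const char *path, const struct stat *sb,
                 int typeflag, struct FTW *ftwbuf) {
    (void) path; (void) sb; (void) ftwbuf;   /* unused in this sketch */
    if (typeflag == FTW_F) {
        regular_files++;
    }
    return 0;   /* a nonzero return would stop the walk */
}

/*
 * Count the regular files under root without following symbolic links.
 * Returns the count, or -1 if nftw reports an error.
 */
long count_regular_files(const char *root) {
    assert(root != NULL);

    regular_files = 0;
    if (nftw(root, tally, 10, FTW_PHYS) != 0) {
        perror("nftw");
        return -1;
    }
    return regular_files;
}
```

    The static counter is the usual workaround for the callback having no user-data parameter; it also means this helper is not reentrant.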

  5. Review of function pointers in C.
    1. A function pointer in C is just what the term says -- a pointer to a function.
      1. A function pointer value can be treated like other values.
      2. I.e., it can be assigned, placed in an array, passed to other functions, returned by functions.
    2. The primary utility of function pointers is as parameters to other functions.
      1. This allows the called function to perform different work based on the function it is passed.
      2. For example, the second argument to the ftw functions allows their user to define exactly what happens when each element of a directory hierarchy is visited.
    3. The subject of function pointers is covered in Section 5.11 of K&R, on pages 118-121.
      1. An example illustrates the use of a function pointer as a parameter to a sorting function.
      2. Specifically, the function parameter is the comparison function used in the body of the sort.
      3. Making this a function pointer allows different implementations of the comparison to be provided, similar to the way the Comparable.compareTo method has different implementations in Java.
    4. Here is a variant of the K&R example.
      #include <stdio.h>
      #include <stdlib.h>   /* qsort, atoi */
      #include <unistd.h>
      #include "std-macros.h"
      
      /**
       *
       * This program exercises the system qsort function.  The focus is on the
       * comparator functions that qsort calls, as an illustration of using function
       * pointers in C.
       *
       */
      
      
      /**
       * Compare two void* values as ints, by casting them and using normal numeric
       * comparison.
       */
      int intcmp(const void* a1, const void* a2) {
          int v1 = *((int*) a1);
          int v2 = *((int*) a2);
      
          if (v1 < v2) return -1;
          if (v1 > v2) return 1;
          return 0;
      }
      
      /**
       * Like intcmp, but reverses the sense of the comparison.  This allows the
       * sorting order to be defined as descending, without changing the implementation
       * of the sort function.
       */
      int intcmp_reverse(const void* a1, const void* a2) {
          int v1 = *((int*) a1);
          int v2 = *((int*) a2);
      
          if (v1 > v2) return -1;
          if (v1 < v2) return 1;
          return 0;
      }
      
      /**
       * Compare two void* values as numeric strings, by casting them to char*, then
       * converting them using atoi, then using normal numeric comparison.
       *
       * Note here the intermediate use of char** as a cast, then the dereference down
       * to char*.  I.e., the following code is used to cast the incoming void*
       * argument a1 into the char* variable s1:
       *
       *     char* s1 = *((char**) a1);
       *
       * The deal is that qsort works with pointers to the array values it's sorting,
       * not the values themselves.  In this case, the array being sorted contains
       * char* values.  This means that qsort is working with element pointers of
       * type char**, since it's pointing to each char* element of the array.
       * Hence, this function receives values of type char**, carried in the generic
       * pointers of type void*.
       */
      int str_intcmp(const void* a1, const void* a2) {
          char* s1 = *((char**) a1);
          char* s2 = *((char**) a2);
          int v1 = atoi(s1);
          int v2 = atoi(s2);
      
          if (v1 < v2) return -1;
          if (v1 > v2) return 1;
          return 0;
      }
      
      
      /**
       * The main function defines unsorted int and string arrays.  It then calls the
       * system qsort function to sort the int array in ascending and descending
       * orders, using the preceding two int comparison functions.  It also calls
       * qsort to sort the string array in ascending order, using str_intcmp.
       */
      int main(void) {
          int i;
          int int_data[6] = {1, 8, 3, 4, 2, 1};
          char* str_data[6] = {"1", "8", "3", "4", "2", "1"};
          size_t int_nelems = sizeof(int_data) / sizeof(int);
          size_t str_nelems = sizeof(str_data) / sizeof(char*);
      
          /*
           * Print the int array before sorting.
           */
          for (i = 0; i < int_nelems; i++) printf("%d ", int_data[i]);
          printf("\n");
      
          /*
           * Sort the int array using intcmp comparator and print results.
           */
          qsort((void*) int_data, int_nelems, sizeof(int), intcmp);
          for (i = 0; i < int_nelems; i++) printf("%d ", int_data[i]);
          printf("\n");
      
          /*
           * Sort the int array using intcmp_reverse comparator and print results.
           */
          qsort((void*) int_data, int_nelems, sizeof(int), intcmp_reverse);
          for (i = 0; i < int_nelems; i++) printf("%d ", int_data[i]);
          printf("\n");
      
      
          /*
           * Print the string array before sorting.
           */
          for (i = 0; i < str_nelems; i++) printf("%s ", str_data[i]);
          printf("\n");
      
          /*
           * Sort the string array using str_intcmp comparator and print results.
           */
          qsort((void*) str_data, str_nelems, sizeof(char*), str_intcmp);
          for (i = 0; i < str_nelems; i++) printf("%s ", str_data[i]);
          printf("\n");
      
          return 0;
      }
      
    5. The declaration of the system qsort function shows how a function-pointer parameter is declared:
      void qsort(void* base, size_t nel, size_t width,
                 int (*compar)(const void*, const void*));
      
      1. The fourth parameter declares a pointer to a function that takes two constant void* arguments and returns an int.
      2. The parentheses around the parameter name are critical, since without them the type would be
        int *compar(const void*, const void*)
        
        which declares that compar is a function returning a pointer to an int, which is a different thing.
    6. We can now revisit the declaration of the second parameter to the nftw function, which is
      int (*fn) (const char*, const struct stat*, int, struct FTW*)
      
      1. This is a pointer to a function of four arguments, returning an int.
      2. The declaration of the visit function is consistent with this parameter type, i.e.,
        int visit(const char* path, const struct stat* stat, int flags, struct FTW* ftw)
        
    7. The syntax of function pointer types is rather tricky, and takes some getting used to.
      1. I find that using typedefs makes function pointer declarations more readable.
      2. E.g., here is a definition of the qsort signature using a type definition for the function parameter:
        typedef int (* CompareFunc)(const void*, const void*);
        void qsort(void* base, size_t nel, size_t width, CompareFunc compar);
        
      3. The following is an equivalent definition:
        typedef int CompareFunc(const void*, const void*);
        void qsort(void* base, size_t nel, size_t width, CompareFunc* compar);
        
      4. It is interesting to note that, given the second (function-type) typedef, the following declaration is also legal. C adjusts a parameter declared with a function type to the corresponding pointer-to-function type, so it is equivalent to the previous one:
        void qsort(void* base, size_t nel, size_t width, CompareFunc compar);
        
        1. Outside of parameter lists, though, there is no way to create or store a plain function value in C, only a function pointer value.
        2. Specifically, the name of a declared function converts to a pointer to the function in almost every expression; it is not plain function data (whatever that might look like).
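
    To tie the pieces of this section together, the sketch below stores function pointers in an array, returns one from a function, and passes one to a function, all through a typedef. Every name in it (BinaryOp, op_table, select_op, apply) is invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Pointer to a function taking two ints and returning an int. */
typedef int (*BinaryOp)(int, int);

static int add(int a, int b) { return a + b; }
static int sub(int a, int b) { return a - b; }
static int mul(int a, int b) { return a * b; }

/* Function pointers placed in an array: each symbol maps to its function. */
static const struct { char symbol; BinaryOp op; } op_table[] = {
    { '+', add }, { '-', sub }, { '*', mul }
};

/* A function pointer returned by a function; NULL if symbol is unknown. */
BinaryOp select_op(char symbol) {
    size_t i;
    for (i = 0; i < sizeof(op_table) / sizeof(op_table[0]); i++) {
        if (op_table[i].symbol == symbol) {
            return op_table[i].op;
        }
    }
    return NULL;
}

/* A function pointer passed as a parameter, then called. */
int apply(BinaryOp op, int a, int b) {
    assert(op != NULL);
    return op(a, b);    /* equivalent to (*op)(a, b) */
}
```

    For instance, apply(select_op('*'), 2, 3) evaluates to 6. Without the typedef, the return type of select_op would have to be written out in full, which is considerably harder to read.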
  6. Other file- and directory-related topics (Stevens Sections 4.4 - 4.20).
    1. These sections of Stevens describe additional file and directory functions, none of which is necessary in Programming Assignment 3.
    2. There is useful explanatory information relevant to the assignment, including the following:
      1. The explanation of file access permissions in Section 4.5.
      2. The explanation of file size in Section 4.12.
      3. The explanation of symbolic links in Section 4.16.
      4. The explanation of file times in Section 4.18.
      5. The summary of file access permission bits in Section 4.24.
    3. You should also read and understand Sections 4.4 and 4.14, which present information we'll use later in the quarter.

  7. Introduction to Buffered Standard I/O (Section 5.1).
    1. Spec'd by ISO C standard.
    2. Implemented in many OSs other than UNIX.

  8. Streams and Files (Section 5.2).
    1. Chapter 3 functions centered on file descriptors.
    2. Standard I/O lib centered on streams.
    3. Streams are associated with fopen'd files.
    4. Streams can be one-char-per-byte ASCII or multi-byte "wide" char sets.
    5. Applications should never need to examine FILE objects.
    6. Rather, FILE* pointers are passed to all functions that deal with stdio streams.

  9. Std In, Out, and Err (Sect 5.3).
    1. As we've seen, these are pre-assigned streams.
    2. The POSIX-def'd file descriptors are STDIN_FILENO, STDOUT_FILENO, and STDERR_FILENO.
    3. The FILE* stream names are stdin, stdout, and stderr.

  10. Buffering (Sect 5.4).
    1. The goal of buffering is to minimize read/write calls.
    2. Three types of buffering are provided:
      1. Fully buffered.
      2. Line buffered.
      3. Unbuffered.
    3. Observations:
      1. Fully buffered streams are typically associated with files; the buffer is usually malloc'd by the library on the first I/O operation.
      2. Buffered streams can be flushed with fflush.
      3. Line buffered streams are typically associated with terminal devices.
      4. Stderr is normally unbuffered.
      5. ISO C requires stdin and stdout to be fully buffered unless on interactive device.
      6. ISO C says stderr is never fully buffered.
    4. The buffering scheme can be changed with the setbuf and setvbuf functions.
    5. These must be called after the stream is opened, but before any other operation is performed on it.
    6. As noted above, fflush flushes buffer output.
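
    One way to see buffering at work is to compare what the kernel thinks the file size is before and after a flush. The sketch below does this with a temporary file; flush_demo and size_on_disk are hypothetical names, and the expectation that the size is still zero before the flush assumes a typical implementation that honors the full-buffering request.

```c
#define _POSIX_C_SOURCE 200809L   /* assumption: lets strict -std=c99 builds see fileno */
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>

/* Size in bytes of the file underlying fp, as the kernel sees it, or -1. */
static long size_on_disk(FILE *fp) {
    struct stat sb;
    if (fstat(fileno(fp), &sb) != 0) {
        return -1;
    }
    return (long) sb.st_size;
}

/*
 * Write a short string to a fully buffered temporary file and record the
 * underlying file's size before and after fflush.
 * Returns 0 on success, -1 on failure.
 */
int flush_demo(long *before, long *after) {
    static char buf[BUFSIZ];    /* must outlive the stream that uses it */
    FILE *fp;

    assert(before != NULL && after != NULL);

    fp = tmpfile();
    if (fp == NULL) {
        return -1;
    }

    /* setvbuf comes before any other operation on the stream. */
    if (setvbuf(fp, buf, _IOFBF, sizeof buf) != 0) {
        fclose(fp);
        return -1;
    }

    fputs("hello", fp);             /* lands in buf, not yet in the file */
    *before = size_on_disk(fp);

    fflush(fp);                     /* forces the underlying write */
    *after = size_on_disk(fp);

    fclose(fp);
    return 0;
}
```

    Replacing _IOFBF with _IONBF should make the two sizes equal, since every fputs then goes straight to write.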

  11. Opening a Stream (Sect 5.5).
    1. You've already used fopen and fclose.
    2. There are also freopen and fdopen.
      1. freopen closes the stream first if it is already open.
      2. fdopen associates an existing fd with a stream.
    3. Pages 138 and 139 cover details of the ways files can be opened.

  12. Reading and Writing a Stream (Sect 5.6).
    1. There are three types of I/O:
      1. char-at-a-time
      2. line-at-a-time
      3. direct
    2. Stream i/o functions return EOF on end of file or error.
    3. To distinguish eof versus error, there are boolean-valued ferror and feof system functions.
    4. You've already used getc, fgetc, and getchar.
    5. These provide char-at-a-time input.
    6. The output analogs are putc, fputc, and putchar.
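
    The EOF-versus-error distinction is easiest to see in a char-at-a-time copy loop. The following is a minimal sketch; copy_stream is a made-up name.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Copy stream in to stream out one character at a time.
 * Returns the number of characters copied, or -1 on a read or write error.
 */
long copy_stream(FILE *in, FILE *out) {
    int c;      /* int, not char, so EOF is distinguishable from any byte */
    long n = 0;

    assert(in != NULL && out != NULL);

    while ((c = getc(in)) != EOF) {
        if (putc(c, out) == EOF) {
            return -1;              /* write error */
        }
        n++;
    }

    /* getc returned EOF: find out whether that was an error. */
    if (ferror(in)) {
        return -1;
    }
    return n;                       /* otherwise feof(in) is true */
}
```

    Calling copy_stream(stdin, stdout) gives a bare-bones version of cat. Declaring c as char is a classic bug: on some systems the loop never ends, and on others a legitimate 0xFF byte ends it early.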

  13. Line-at-a-Time I/O (Sect 5.7).
    1. The functions are fgets and fputs.
    2. Read (again) the admonishment on Page 142 about never using gets.
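
    Unlike gets, fgets is told how big the buffer is, so an over-long line is simply delivered in pieces. The sketch below counts lines with a deliberately tiny buffer to show this; count_lines and DEMO_BUF_SIZE are names invented for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define DEMO_BUF_SIZE 16    /* deliberately small, to force long lines to split */

/*
 * Count the newline-terminated lines remaining in fp.  A line longer than
 * the buffer arrives in several fgets calls; only the last piece ends in
 * '\n', so each line is still counted exactly once.
 */
long count_lines(FILE *fp) {
    char line[DEMO_BUF_SIZE];
    long lines = 0;

    assert(fp != NULL);

    while (fgets(line, sizeof line, fp) != NULL) {
        size_t len = strlen(line);
        if (len > 0 && line[len - 1] == '\n') {
            lines++;
        }
    }
    return lines;
}
```

    A final line with no trailing newline is not counted by this version; whether it should be is a design choice.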

  14. Std I/O Efficiency (Sect 5.8).
    1. Stevens presents a timing example analogous to the one in Chapter 3.
    2. The results are that the standard I/O versions are somewhat slower than the best read/write version from Chapter 3, but far faster than byte-at-a-time read and write.
    3. Observations:
      1. Line-at-a-time is faster than may be expected because it is implemented with memccpy.
      2. Stevens gives other interesting details about system calls versus function calls.

  15. Binary I/O (Sect 5.9).
    1. The buffered analogs of read and write are fread and fwrite.
    2. These are most commonly used for array and structure I/O, where size says exactly how many bytes to read or write.
    3. A practical issue is the non-portable nature of binary data.
      1. Different compilers may treat struct offsets differently.
      2. Different computer architectures may vary in storing multi-byte ints and floats.
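
    A minimal sketch of structure I/O with fwrite and fread follows. The struct point type and the two helper names are hypothetical; the file it produces is only guaranteed readable by a program built with the same compiler on the same architecture, for the reasons just listed.

```c
#include <assert.h>
#include <stdio.h>

struct point { int x; int y; };     /* illustrative record type */

/* Write n points with a single fwrite call; returns the number written. */
size_t save_points(FILE *fp, const struct point *pts, size_t n) {
    assert(fp != NULL && pts != NULL);
    return fwrite(pts, sizeof(struct point), n, fp);
}

/* Read up to n points; returns the number of whole records actually read. */
size_t load_points(FILE *fp, struct point *pts, size_t n) {
    assert(fp != NULL && pts != NULL);
    return fread(pts, sizeof(struct point), n, fp);
}
```

    Both functions return a count of objects, not bytes, so a return value smaller than n signals end of file or an error.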

  16. Positioning a Stream (Sect 5.10).
    1. The analog of lseek is fseek.
    2. The ftell stream function is the analog of an interrogating call to lseek.
    3. There is also the stream rewind function.
    4. For text files, ISO C supports fgetpos and fsetpos.
    5. These are preferable for text files, since low-level byte offset of fseek is not really of interest.
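
    As an illustration of fseek and ftell working together, the sketch below finds a file's size and then restores the original position. stream_size is a made-up name, and seeking to SEEK_END is assumed to work as it does on UNIX systems (ISO C does not promise it for every stream).

```c
#include <assert.h>
#include <stdio.h>

/*
 * Return the size in bytes of the file behind fp, leaving the current
 * position unchanged.  Returns -1 on error.
 */
long stream_size(FILE *fp) {
    long saved, size;

    assert(fp != NULL);

    saved = ftell(fp);                  /* like lseek(fd, 0, SEEK_CUR) */
    if (saved == -1L) {
        return -1;
    }
    if (fseek(fp, 0L, SEEK_END) != 0) { /* jump to the end */
        return -1;
    }
    size = ftell(fp);
    if (fseek(fp, saved, SEEK_SET) != 0) {  /* go back where we were */
        return -1;
    }
    return size;
}
```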

  17. Formatted I/O (Sect 5.11).
    1. You've been here plenty.
    2. Output functions are printf, fprintf, sprintf, snprintf.
    3. You may be using sprintf for Program 3.
    4. The tables on Page 150 summarize "%" formatting options.
    5. There are variable-length-argument versions of print functions, names prefixed with "v".
    6. These are handy in ways we'll see in coming weeks.
    7. Formatted input is provided by the scanf series.
    8. These can be tricky to use, but are helpful in reading structured input from stdin streams.
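
    The "v" versions exist so that you can write your own printf-like functions. Here is a hedged sketch of one built on snprintf and vsnprintf; format_tagged is a hypothetical name, not a library function.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/*
 * Format a message into buf (size bytes), prefixed with "[tag] ".
 * Like snprintf, returns the length the complete message needs, not
 * counting the terminating '\0', even if buf was too small to hold it.
 * Returns a negative value on a formatting error.
 */
int format_tagged(char *buf, size_t size, const char *tag,
                  const char *fmt, ...) {
    va_list ap;
    int prefix, body;

    assert(buf != NULL && tag != NULL && fmt != NULL);

    prefix = snprintf(buf, size, "[%s] ", tag);
    if (prefix < 0) {
        return prefix;
    }

    va_start(ap, fmt);
    if ((size_t) prefix < size) {
        /* Room remains: append the caller's formatted text. */
        body = vsnprintf(buf + prefix, size - (size_t) prefix, fmt, ap);
    } else {
        /* Buffer already full: just measure the rest. */
        body = vsnprintf(NULL, 0, fmt, ap);
    }
    va_end(ap);

    return (body < 0) ? body : prefix + body;
}
```

    For example, format_tagged(buf, sizeof buf, "p3", "%s=%d", "size", 42) leaves "[p3] size=42" in a large enough buf and returns 12. Unlike sprintf, it cannot overrun the buffer.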

  18. Implementation Details (Sect 5.12).
    1. Std I/O functions call unbuffered I/O functions presented in Chapter 3.
    2. The fd associated with a stream is available via fileno function.
    3. Example on Pages 154-155 illustrates implementation issues.

  19. Temporary Files (Sect 5.13).
    1. The functions tmpnam and tmpfile are useful for creating temporary files.
    2. The files have unique names, up to a TMP_MAX limit, which varies for ISO C and POSIX (at least 25 versus at least 10,000).

  20. Alternatives to Std I/O (Sect 5.14).
    1. A potential inefficiency of Std I/O lib is extra data copying.
    2. More efficient implementations pass pointers in places where copying can be avoided.
    3. There are also implementations specialized for embedded systems that require a small memory footprint.

  21. Intro to System Data Files and Information (Sect 6.1).
    1. Like any OS, UNIX uses lots of system files.
    2. The root of the file system is "/".
    3. There are a number of typical subdirs, including /etc, /bin, /usr, /dev, /var.

  22. Password File (Sect 6.2).
    1. Standard place is /etc/passwd.
    2. System programs access it through struct passwd.
    3. It's defined in <pwd.h>:
      struct passwd {
              char    *pw_name;
              char    *pw_passwd;
              uid_t   pw_uid;
              gid_t   pw_gid;
              char    *pw_age;
              char    *pw_comment;
              char    *pw_gecos;
              char    *pw_dir;
              char    *pw_shell;
      };
      
    4. Access functions are getpwnam, getpwuid.
    5. Signatures:
      struct passwd *getpwnam(
          const char *name);
      
      struct passwd *getpwuid(
          uid_t uid);
      
    6. See Pages 162 - 164 for details.
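
    A small sketch of a password-file lookup follows. uid_for_name is a hypothetical helper; it uses only fields that appear in the structure above.

```c
#include <assert.h>
#include <pwd.h>
#include <stdio.h>
#include <sys/types.h>

/*
 * Look up a login name in the password database and print a few fields.
 * Returns the user's numeric ID, or -1 if there is no such user.
 */
long uid_for_name(const char *name) {
    struct passwd *pw;

    assert(name != NULL);

    /* The result points to static storage: do not free it, and expect
       the next getpw* call to overwrite it. */
    pw = getpwnam(name);
    if (pw == NULL) {
        return -1;
    }

    printf("name=%s uid=%ld home=%s shell=%s\n",
           pw->pw_name, (long) pw->pw_uid, pw->pw_dir, pw->pw_shell);
    return (long) pw->pw_uid;
}
```

    Going through getpwnam rather than parsing /etc/passwd by hand means the same code keeps working when accounts come from a network service instead of the local file.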

  23. Shadow Passwords (Sect 6.3).
    1. Used to strengthen security: the encrypted passwords are kept in a separate file that ordinary users cannot read.
    2. Can be subsumed by NIS, LDAP access.

  24. Group File (Sect 6.4).
    1. Structure defined in <grp.h>.
    2. Defines group as list of user names.
    3. Access functions are getgrnam, getgrgid.

  25. Supplementary Group IDs (Sect 6.5).
    1. Defined in /etc/group.
    2. Allows user to belong to multiple groups.

  26. Implementation Differences (Sect 6.6).
    1. Falcon/hornet use NIS+.
    2. There is also LDAP.
    3. These provide access to a common network password file.

  27. Other Data Files (Sect 6.7).
    1. Various network services are common.
    2. /etc/services, /etc/protocols, /etc/networks.

  28. System Identification (Sect 6.9).
    1. POSIX requires uname function.
    2. Signature:
      int uname(struct utsname *name);
      
    3. struct utsname is defined in <sys/utsname.h>:
      struct utsname {
              char    sysname[_SYS_NMLN];
              char    nodename[_SYS_NMLN];
              char    release[_SYS_NMLN];
              char    version[_SYS_NMLN];
              char    machine[_SYS_NMLN];
      };
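
    A minimal sketch of calling uname follows; describe_system is a made-up name. It builds roughly what "uname -srm" prints.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/utsname.h>

/*
 * Fill buf (size bytes) with "sysname release machine".
 * Returns 0 on success, -1 if uname fails.
 */
int describe_system(char *buf, size_t size) {
    struct utsname u;

    assert(buf != NULL);

    if (uname(&u) < 0) {    /* POSIX: non-negative on success */
        return -1;
    }
    snprintf(buf, size, "%s %s %s", u.sysname, u.release, u.machine);
    return 0;
}
```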
      


