CSC 357 Lecture Notes Week 1, Part 2
Function Declarations versus Definitions; Pointers and Arrays in C



  1. The distinction between C function declaration and definition.
    1. A declaration gives only the signature of a function, e.g.,
      char* match(char* pattern, char* target);
      

      1. Such declarations typically appear in .h files, details of which are discussed in Lecture Notes Week 2.
      2. Strictly speaking, a function declaration does not need parameter names, since it defines just the type signature.
      3. E.g., the following declaration has the same semantics as the preceding one
        char* match(char*, char*);
        

        though it is typical to include names for clarity.
    2. The definition of a function defines its body, i.e., the code between the curly braces.
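As a minimal sketch of the distinction, the following shows a declaration written without parameter names, followed by the matching definition. The function count_char is a hypothetical example chosen for illustration; it is not part of the course code.

```c
#include <stddef.h>

/* Declaration: type signature only; parameter names are optional. */
size_t count_char(const char*, char);

/* Definition: the same signature, now with parameter names and a body. */
size_t count_char(const char* s, char c) {
    size_t n = 0;

    for (; *s != '\0'; s++) {
        if (*s == c) {
            n++;
        }
    }
    return n;
}
```

A call such as count_char("banana", 'a') can appear anywhere after the declaration, even in a file that never sees the definition.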

  2. Declare-before-use policy in C programs.
    1. Before a function is called in a C program, the compiler must have seen at least its declaration.
    2. This is achieved by putting function declarations before function definitions in the program.
    3. Declare-before-use can also be achieved without function declarations, by ordering the function definitions appropriately, i.e., by defining a function lexically before all other functions that call it.
    4. For mutually recursive functions, a function declaration is mandatory, since neither definition can lexically precede the other, e.g.,

      char* match(char* pattern, char* target);
      

      ...

      char* match_start(char* pattern, char* target) {
      ...
      matched = match(...);
      ...
      }

      ...

      char* match(char* pattern, char* target) {
      ...
      match_start(pattern, target);
      ...
      }

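    5. When a function is called before it is declared (or defined), a C89 compiler assumes that the function returns int and that the arguments as passed (after the default promotions) match its parameters; the C99 and later standards removed such implicit declarations, so modern compilers report at least a warning.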
      1. These assumptions are very often wrong.
      2. If the function call violates these assumptions, the compiler will issue diagnostics to that effect, often confusing ones.
      3. E.g., if the match declaration were missing in the preceding example, the compiler would issue messages of the form:
        sgrep.c: In function 'match_start':
        sgrep.c:56: warning: assignment makes pointer from integer without a cast
        sgrep.c: At top level:
        sgrep.c:98: error: conflicting types for 'match'
        sgrep.c:56: error: previous implicit declaration of 'match' was here
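The mutual-reference case can be reduced to a small, complete sketch. The functions is_even and is_odd below are hypothetical stand-ins for match and match_start; the point is only that one forward declaration is enough to let each definition call the other.

```c
/* Forward declaration: lets is_even call is_odd before is_odd is defined. */
int is_odd(unsigned n);

/* Returns 1 if n is even, 0 otherwise. */
int is_even(unsigned n) {
    return n == 0 ? 1 : is_odd(n - 1);
}

/* Returns 1 if n is odd, 0 otherwise. */
int is_odd(unsigned n) {
    return n == 0 ? 0 : is_even(n - 1);
}
```

Removing the first line should reproduce the kind of implicit-declaration diagnostics shown above, with details depending on the compiler and its version.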
        

  3. C's memory model.
    1. Computer memory in a C program is directly accessible to the programmer.
    2. When you declare an array of 10 chars, for example, the compiler allocates a contiguous block of memory that is exactly 10 bytes long.
    3. There is no "veil" over memory in C, as there is in Java and other high-level languages.
      1. Metaphorically, a veil obscures certain features, leaving them to the viewer's imagination.
      2. This is the case in Java's memory model, where the programmer cannot see exactly what's going on in JVM memory.
      3. A string of 10 characters in Java is definitely not a simple memory block of 10 bytes.
    4. In C, the programmer is often keenly aware of the machine-level structure of memory.
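The claim about a 10-char array can be checked directly. This is a small sketch with hypothetical function names; it relies only on sizeof and on pointer subtraction within one array.

```c
#include <stddef.h>

/* sizeof reports the exact storage of the array: 10 bytes. */
size_t char_array_size(void) {
    char buf[10];

    return sizeof buf;
}

/* Distance in elements between the first and last cells; because the
   block is contiguous and each char is one byte, this is 9. */
ptrdiff_t char_array_span(void) {
    char buf[10];

    return &buf[9] - &buf[0];
}
```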

  4. Pointers and arrays (K&R chapter 5).
    1. A pointer in C is a memory address.
    2. An array in C is a block of memory, to which a pointer can point.
    3. Hence, pointers and arrays in C are closely related.

  5. Pointers and addresses (K&R Section 5.1).
    1. Memory in C is laid out as consecutively numbered cells, each number being a memory address.
      1. Most typically, the smallest sized cell is one char, also called a "byte".
      2. Cells are typically grouped into bigger segments, often referred to as "words".
        1. For example, a pair of one-byte cells is a short word.
        2. Four adjacent cells form a long word.
        3. A pointer is typically a long word, and it holds the value of a memory address.
          1. This is the case with so-called 32-bit computer architectures.
          2. On such machines, the size of a memory address is 32 bits = four 8-bit bytes = 1 long word.
    2. Memory viewed as byte-addressable words:

    3. Variables in C occupy some place in memory, and hence have an address.
      1. C provides the '&' operator to get the address of (i.e., a pointer to) any variable.
      2. C also provides the pointer dereferencing operator '*', to access the value that a pointer points to.
      3. The '*' symbol is also used to declare a variable as a pointer.
    4. Here are the code examples from Pages 94 and 95 of K&R:
      int x = 1, y = 2, z[10];
      int *ip;         /* ip is a pointer to int */
      int *iq;         /* iq is another pointer to int */
      
      ip = &x;         /* ip now points to x */
      y = *ip;         /* y is now 1 */
      *ip = 0;         /* x is now 0 */
      ip = &z[0];      /* ip now points to z[0] */
      
      *ip = *ip + 10;  /* increments the value ip points to by 10 */
      y = *ip + 1;     /* accesses what ip points to and adds 1 to it */
      *ip += 1;        /* uses the += operator to increment *ip */
      iq = ip;         /* assigns the pointer value of ip to iq */
      

      While these examples are rather artificial in terms of meaningful C programs, they illustrate the fundamentals of how '&' and '*' are used.
    5. The size of the memory segment to which a pointer refers is determined by its declared type.
      1. E.g., an int pointer points to a 4-byte word (assuming that's how big an int is).
      2. A char pointer points to a single byte.
      3. These constraints impact the way address arithmetic is performed, as we'll see shortly.
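One way to see the effect of the pointed-to type is to measure how far an address moves when a pointer is incremented. The sketch below uses hypothetical function names and casts to char* so that the distance is counted in bytes.

```c
#include <stddef.h>

/* Bytes the address advances when an int pointer is incremented by 1. */
size_t int_stride(void) {
    int z[2];
    int *ip = &z[0];

    return (size_t)((char *)(ip + 1) - (char *)ip);
}

/* Bytes the address advances when a char pointer is incremented by 1. */
size_t char_stride(void) {
    char c[2];
    char *cp = &c[0];

    return (size_t)((cp + 1) - cp);
}
```

int_stride() returns sizeof(int), which is 4 on the machines assumed in these notes but may differ elsewhere; char_stride() returns 1.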

  6. Pointers and function arguments (K&R Section 5.2).
    1. Function arguments are passed by value.
    2. Call-by-reference parameter passing can be achieved using '*' and '&'.
    3. Consider the versions of the swap function on pages 95 and 96 of K&R:
      void swap(int x, int y) {   /* WRONG, in that swap(a,b) does not swap vars a and b */
          int temp;
      
          temp = x;
          x = y;
          y = temp;
      }
      
      versus
      void swap(int* x, int* y) {   /* CORRECT, in that swap(&a,&b) does swap a and b */
          int temp;
      
          temp = *x;
          *x = *y;
          *y = temp;
      }
      
    4. See the picture on Page 96 of K&R.
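To compare the two versions side by side, they can be given distinct names and placed in one file. The names swap_by_value and swap_by_pointer are assumptions made here for illustration; K&R calls both versions swap.

```c
/* Receives copies of the caller's values, so only the copies change. */
void swap_by_value(int x, int y) {
    int temp = x;

    x = y;
    y = temp;
}

/* Receives the addresses of the caller's variables, so they change. */
void swap_by_pointer(int *x, int *y) {
    int temp = *x;

    *x = *y;
    *y = temp;
}
```

Given int a = 1, b = 2; the call swap_by_value(a, b) leaves a and b unchanged, while swap_by_pointer(&a, &b) leaves a equal to 2 and b equal to 1.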

  7. Pointers, arrays, and address arithmetic (K&R Section 5.3).
    1. There is a strong relationship between pointers and arrays in C; specifically:
      1. Given the declarations
          int a[10];
          int *pa;
          
        the following assignments are legal
          pa = a;
          pa = &a[0];
          
        and have exactly the same effect, which is that the pointer pa is assigned to point to the zeroth element of the array a.
      2. Given the preceding declarations of a and pa, the following equivalences hold for all 0 <= i < 10:
        a[i] == *(pa + i)
        pa[i] == *(a + i)
        &a[i] == pa + i
      3. Used with formal parameters in a function definition, the declarators t[] and t* denote the same type.
        1. t[] is an array of undetermined length, holding elements of type t.
        2. t* is a pointer to a block of memory, each segment of which is of type t.
    2. The following picture illustrates the pointer/array relationships just described:
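The equivalences in this section can be tested mechanically. The following sketch uses hypothetical function names; count_equivalences checks the three relations at every index, and sum shows an array parameter accepting any pointer into an int array.

```c
/* Returns how many indices 0 <= i < 10 satisfy all three equivalences. */
int count_equivalences(void) {
    int a[10];
    int *pa = a;
    int i, ok = 0;

    for (i = 0; i < 10; i++) {
        a[i] = i * i;
    }
    for (i = 0; i < 10; i++) {
        if (a[i] == *(pa + i) && pa[i] == *(a + i) && &a[i] == pa + i) {
            ok++;
        }
    }
    return ok;
}

/* As a formal parameter, int t[] means the same as int *t. */
int sum(int t[], int n) {
    int i, total = 0;

    for (i = 0; i < n; i++) {
        total += t[i];
    }
    return total;
}
```

Given int v[3] = {1, 2, 3}; both sum(v, 3) and sum(&v[1], 2) are valid calls, the second summing only the last two elements.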

  8. Character pointers and functions (K&R Section 5.5).
    1. C has double-quoted string constants like "Hello world\n", used as a function argument to printf.
    2. String constants can be used to initialize char* and char[] string variables, as in
        char amessage[] = "now is the time";    /* a char array */
        char* pmessage = "now is the time";     /* a char pointer */
        
      1. As explained on Page 104 of K&R, amessage and pmessage are not equivalent definitions.
      2. amessage is an array of exactly strlen("now is the time")+1 elements, whose character elements can be changed, but amessage as a whole cannot be reassigned.
      3. pmessage is a pointer to a block of characters; modifying those characters is undefined behavior in C, but pmessage itself can be reassigned to point elsewhere.
      4. The following lines of code illustrate the differences between the two string variables:
          amessage[2] = 't';                  /* OK */
          pmessage = "another message";       /* OK */
        
          amessage = "another message";       /* incompatible assignment types */
          pmessage[2] = 't';                  /* undefined behavior */
          
    3. There are a number of library functions that allow strings to be operated on at the character level, including: strcpy, strcat, and strcmp.
      1. K&R discusses implementations of these on pages 105-107.
      2. You should read the man page descriptions of these and related string-processing functions, using the command man string(3C) on falcon/hornet.
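As a brief sketch of these library functions in use, the code below builds and edits the example string. The function names build_message and edit_array_copy are hypothetical, and the caller is assumed to supply a buffer of at least 16 chars (15 characters plus the terminating '\0').

```c
#include <string.h>

/* Builds "now is the time" in buf from two pieces and returns buf.
   Assumes buf has room for at least 16 chars. */
char *build_message(char *buf) {
    strcpy(buf, "now is ");
    strcat(buf, "the time");
    return buf;
}

/* Edits a char array initialized from a string constant, which is legal,
   and returns 1 if the result compares equal to the expected text. */
int edit_array_copy(void) {
    char amessage[] = "now is the time";   /* writable array of 16 chars */

    amessage[2] = 't';
    return strcmp(amessage, "not is the time") == 0;
}
```

Note that strcpy and strcat do not check the destination size, so the 16-char assumption is the caller's responsibility.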


