Some basic 'C' Programming

by
Wenton L. Davis

Here are some tips and tricks in C programming. There are plenty of "Learn to Program in C" books and web pages out there; this is more of a reference for some of the more obscure things I've found that are not as easy to find.

Pointers to Functions

In some cases, it is necessary to work with a function through its location (its address in memory) rather than its name. (Lots of puritanist CS hard-core people just shit their pants. Deal.) Working with shared/dynamic libraries, whether on Windows or Linux, is one application that can take great advantage of these function pointers. Many highly optimised operations like state machines can also be greatly improved this way. Suppose you have two functions:

int detect_neg( int test )
{
  int result=0;
  
  if( test < 0 )
  {
    printf( "Transition to negative detected\n" );
    result = 1;
  }
  
  return result;
}

int detect_pos( int test )
{
  int result=0;
  
  if( test > 0 )
  {
    printf( "Transition to positive detected\n" );
    result = 1;
  }
  
  return result;
}

Basically, detect_neg() looks to see if a number is less than 0, and detect_pos() looks to see if a number is greater than 0. So far, this looks like something that would be more easily handled with a simple if-then-else structure, right? But what if the data we are looking at is more like an audio sample, and we only care about the transitions, crossing the 0V line? We don't want output from every sample, just the transitions. So if our code can watch the flow of data and look for only one or the other at a time, we can write something like this:

int (*detector)(int) = detect_neg;

int stream_of_integers( void )
{
  /* who knows what might be in here.... */
  return 0; /* placeholder so the function compiles */
}

int main()
{
  int sign=0; //0 means we are reading positive numbers, 1 means negative
  int change;
  
  while( 1 )
  {
    change = detector( stream_of_integers() );
    if( change )
    {
      if( sign == 0 ) //have we been reading positive numbers?
      {
        detector = detect_pos;
        sign = 1;
      }
      else //nope, we've been reading negative
      {
        detector = detect_neg;
        sign = 0;
      }
    }
  }
}

Now we are watching a stream of integers; as long as we keep reading positive integers, we are only looking for the first negative number. Once a negative number is detected, the detector pointer is changed to point at the function that only looks for positive numbers, and back again on the next transition.

OK, admittedly, that wasn't a real exciting example. As an alternative, let's look at a state machine, making decisions based on an input character:

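//the state functions must be declared before they can be assigned
int state0( unsigned char in );
int state1( unsigned char in );
int state2( unsigned char in );
//...and likewise for state3 through state7

int (*state)(unsigned char) = state0;

int state0( unsigned char in )
{
  //if 0, assume no valid data, wait here for valid data
  if( ( in > 0 ) && ( in < 0x10 ) ) state = state1;
  if( ( in >= 0x10 ) && ( in < 0x20 ) ) state = state2;
  if( ( in >= 0x20 ) && ( in < 0x80 ) ) state = state3;
  if( in >= 0x80 ) state = state4;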
  
  return 0;
}

int state1( unsigned char in )
{
  //collect characters until a zero is found, then return to state 0
  if( in > 0 )
  {
    //do something with the characters
    //not necessary, but to explicitly stay in state1:
    state = state1;
  }
  else //found a NUL character
  {
    state = state0;
  }
  
  return 0;
}

int state2( unsigned char in )
{
  //if I detect a capital letter, enter state 5
  //if I detect a lower case letter, enter state 6
  //if I detect a numerical digit, enter state 7
  //anything else, return to state 0
  if( ( in >= 'A' ) && ( in <= 'Z' ) ) state = state5;
  else if( ( in >= 'a' ) && ( in <= 'z' ) ) state = state6;
  else if( ( in >= '0' ) && ( in <= '9' ) ) state = state7;
  else state = state0;
  
  return 1;
}

etc...

int main()
{
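  int code = 0;
  int need_more = 0; //nonzero: the current character still needs processing
  
  while( 1 )
  {
    //getch() comes from conio.h or curses; getchar() from stdio.h also works
    if( !need_more ) code = getch();
    need_more = state( code );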
  }
  
  return 0;
}

This is something that could be really useful in a compiler or ASCII text processor. State 0 basically sits there, waiting for a non-NUL character to arrive, and when one does, it enters a new state determined by what that character was. Notice that it returns a 0, indicating that it has finished processing that character. State 2 looks at a character and decides to go to state 5, 6, or 7 if it is uppercase, lowercase, or a digit, respectively, or to return to state 0 otherwise. Notice that state 2 returns a 1, indicating that this character may be needed to determine further processing.

Next, look at the loop in main(). This is a very simple loop, requiring almost no logical processing of the data; the hardest thing the main loop has to do is decide whether to continue processing the same character or to get another character to continue with.

There are a few advantages to this use. The first one is that main() is very simple. It isn't polluted with complex decision-making code. Second, the individual stateN() functions remain fairly simple as well, only dealing with one character at a time, and making most decisions about what to expect to follow. This kind of state machine is very common in compilers, both for lexical and syntax analysis.

The only real trick to using code like this is figuring out how to define the variable. Notice how the pointer variable was defined: first the function's return type, then a * and the variable's name enclosed together in parentheses, and then the parameter list, following the same rules as for a function, within another set of parentheses.

type(*variable_name)(int,int,float)

Now before leaving this idea, let's expand it one more time, into an array of function pointers. It took a little experimenting to find the right syntax for this, but here it is:

type(*variable_name[size] )(int,int,float)

Here is an example showing how this works, functions.c:

#include <stdio.h>
#include <stdlib.h>

int (*function)(int);

int f1( int in )
{
  printf( "F1\n" );
  return in+1;
}

int f2( int in )
{
  printf( "F2\n" );
  return in-1;
}

int f3( int in )
{
  float recip = 1.0 / (float)in;
  
  printf( "F3\n" );
  return (int)(recip*65536.0);
}

int f4( int in )
{
  printf( "F4\n" );
  return in<<1;
}

int (*functions[4])(int) = { f1, f2, f3, f4 };

int main()
{
  int loop;
  
  function = f1;
  printf( "%d\n", function( 10 ) );
  
  function = f2;
  printf( "%d\n", function( 10 ) );
  
  function = f3;
  printf( "%d\n", function( 10 ) );
  
  function = f4;
  printf( "%d\n", function( 10 ) );
  
  printf( "And now...\n" );
  for( loop = 0; loop < 4; loop++ )
  {
    printf( "%d\n", functions[loop](10) );
  }
  
  return 0;
}

And one final argument in favor of this technique. Consider the code:

    switch( control )
    {
      case 0:
        do_0(...);
        break;
      case 1:
        do_1(...);
        break;
      case 2:
        do_2(...);
        break;
      case 3:
        do_3(...);
        break;
    }

Which, in the simplest case, becomes implemented as:

    if( control == 0 )
      do_0(...);
    else if( control == 1 )
      do_1(...);
    else if( control == 2 )
      do_2(...);
    else if( control == 3 )
      do_3(...);

I suppose all of that is fine, but look at how much time is wasted as selections are farther down the list... especially if that list is a couple hundred choices. Every if() statement ahead of the match has to be evaluated, and if this is a very long list, processing time and efficiency suffer. (Optimizing compilers often turn a dense switch into a jump table on their own, but that is up to the compiler.) Consider, instead:

  void (*do_function[4])(...) = { do_0, do_1, do_2, do_3 };
  
  ...
  
  do_function[ control ](...);

Notice in this example, there is no wasted time processing "which function do I want to call?" Instead, it uses the control index to select the function, and calls it directly. For those who are proficient in assembly language, you can compile the code to assembly and look at the result. Yes, it is ugly code, but it is surprising how much cleaner and faster this technique is.

I also found a note on one of the StackOverflow pages that suggested using a typedef to make the code somewhat easier to read:

  typedef unsigned short (*func_table_t)(void*);
  
  func_table_t func_table [MAX_FUNC] =
  {
    some_function,
    some_other_function,
    ...
  };

I have no trouble admitting that is easier to read.

NOTE, if you do come across that web page, you will notice they put & in front of the function names. That is harmless but unnecessary: the author of that answer realized they needed pointers to the functions, but a function name used in an expression is already converted to a pointer to that function.

Typedefs for Structs, Unions, and Enums

It's perhaps a little embarrassing, but from time to time, I struggle with this one. I think that typedef is a very important feature in C when working with structs, unions, and enums, because it can go a long way toward keeping code clean and readable.

Normally, it's easy enough to just:

struct structure_name
{
  int one;
  int two;
  char three;
  short four;
} variables;

but unless you are defining one of those evil global variables, this is a messy, hard to read way of writing code. If you ever need to define another instance of that structure, you have to be explicit:

int in_some_function()
{
  struct structure_name variable_names;
}

Why not do this....?

typedef struct structure_name_is_now_optional
{
  int one;
  int two;
  char three;
  short four;
} clean_name;

clean_name global_variable_name;

int in_some_function()
{
  clean_name variable_names;
}

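The only drawback to this is that when working with linked lists, you can't use "clean_name" inside the structure definition, because the typedef name does not exist until the definition is complete, and the compiler reports it as undefined. In these cases, give the structure a tag and refer to it by that tag inside the definition (the separate forward declaration shown here is optional)....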

struct node_type;

typedef struct node_type
{
  int one;
  int two;
  char three;
  short four;
  struct node_type *next;
} data_node;

data_node list_name;

It's still a little ugly, but within the rest of the code, it will be much, much cleaner and easier to read.

Inline Functions

I came across this weird tidbit regarding inline functions in GCC:

"GCC does not inline any functions when not optimizing unless you specify the always_inline attribute for the function, like this:"

/* Prototype */
inline void foo (const char) __attribute__((always_inline));

Static Functions

A "static" function in C is one that remains visible only within the current source file (translation unit). Most functions are visible externally and can be linked against from other files. In fact, in the assembly code GCC generates, such a function will normally receive a .globl directive. static overrides this default behavior, and the function cannot be linked to from other files.

Wenton's email (wenton@ieee.org)
