The transformations it makes on its input form the first four of C's so-called Phases of Translation. Though an implementation may choose to perform some or all phases simultaneously, it must behave as if it performed them one-by-one in order.
Including files
The most common use of the preprocessor is to include another file:
#include <stdio.h>
int main (void)
{
    printf("Hello, world!\n");
    return 0;
}
The preprocessor replaces the line #include <stdio.h> with the system header file of that name, which declares the printf() function amongst other things. More precisely, the entire text of the file 'stdio.h' replaces the #include directive.
This can also be written using double quotes, e.g. #include "stdio.h". If the filename is enclosed within angle brackets, the file is searched for in the standard compiler include paths. If the filename is enclosed within double quotes, the search path is expanded to include the current source directory. C compilers and programming environments all have a facility which allows the programmer to define where include files can be found. This can be introduced through a command line flag, which can be parameterized using a makefile, so that a different set of include files can be swapped in for different operating systems, for instance.
By convention, include files are given a .h extension, and files not included by others are given a .c extension. However, there is no requirement that this be observed. Occasionally you will see files with other extensions included; in particular, files with a .def extension may denote files designed to be included multiple times, each time expanding the same repetitive content.
#include often compels the use of #include guards or #pragma once to prevent double inclusion.
Conditional compilation
The #if, #ifdef, #ifndef, #else, #elif and #endif directives can be used for conditional compilation.
#define __WINDOWS__
#ifdef __WINDOWS__
#include <windows.h>
#else
#include <unistd.h>
#endif
#if VERBOSE >=2
print("trace message");
#endif
The first line defines a macro __WINDOWS__. The macro could be defined implicitly by the compiler, or specified on the compiler's command line, perhaps to control compilation of the program from a makefile.
The subsequent code tests if a macro __WINDOWS__ is defined. If it is, as in this example, the file <windows.h> is included, otherwise <unistd.h>. The final #if block compiles the trace call only when VERBOSE evaluates to at least 2.
Macro definition and expansion
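There are two types of macros, object-like and function-like. Object-like macros do not take parameters; function-like macros do. The generic syntax for declaring an identifier as a macro of each type is, respectively,
#define <identifier> <replacement token list>
#define <identifier>(<parameter list>) <replacement token list>
Note that in the function-like form there must be no whitespace between the identifier and the left parenthesis.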
Wherever the identifier appears in the source code it is replaced with the replacement token list, which can be empty. For an identifier declared to be a function-like macro, it is only replaced when the following token is also a left parenthesis that begins the argument list of the macro invocation. The exact procedure followed for expansion of function-like macros with arguments is subtle.
Object-like macros were conventionally used as part of good programming practice to create symbolic names for constants, e.g.
#define PI 3.14159
An example of a function-like macro is:
#define RADTODEG(x) ((x) * 57.29578)
This defines a radians-to-degrees conversion which can subsequently be written as RADTODEG(34) or RADTODEG (34). It is expanded in place, so the caller does not need to litter copies of the multiplication constant all over the code. The macro here is written in all uppercase to emphasize that it is a macro, not a compiled function.
Precedence
Note that the example macro RADTODEG(x) given above uses normally superfluous parentheses both around the argument and around the entire expression. Omitting either of these can lead to unexpected results. For example:
- Macro defined as
#define RADTODEG(x) (x * 57.29578)
will expand RADTODEG(a + b) to (a + b * 57.29578)
- Macro defined as
#define RADTODEG(x) (x) * 57.29578
will expand 1 / RADTODEG(a) to 1 / (a) * 57.29578
Neither expansion gives the intended result.
Multiple lines
A macro can be extended over as many lines as required using a backslash escape character at the end of each line. The macro ends after the first line which does not end in a backslash.
The extent to which multi-line macros enhance or reduce the size and complexity of the source of a C program, or its readability and maintainability, is open to debate (there is no experimental evidence on this issue).
Multiple evaluation of side effects
Another example of a function-like macro is:
#define MIN(a,b) ((a)>(b)?(b):(a))
Notice the use of the ?: operator. This illustrates one of the dangers of using function-like macros. One of the arguments, a or b, will be evaluated twice when this "function" is called. So, if the expression MIN(++firstnum,secondnum) is evaluated, then firstnum may be incremented twice, not once as would be expected.
A safer way to achieve a similar result (shown here for max rather than min) would be to use a typeof-construct:
#define max(a,b) \
({ typeof (a) _a = (a); \
typeof (b) _b = (b); \
_a > _b ? _a : _b; })
The typeof keyword, and the construct of placing a compound statement within parentheses, are non-standard extensions implemented in the popular GNU C compiler (GCC). The same general problem can also be solved using a static inline function (available in GCC and standardised in C99), which is as efficient as a #define. The inline function allows the compiler to check/coerce parameter types; in this particular example this appears to be a disadvantage, since the 'max' macro as shown works equally well with different parameter types, but in general having the type coercion is often an advantage.
Within ANSI C, there is no reliable general solution to the issue of side-effects in macro arguments.
Token concatenation
Token concatenation, also called token pasting, is one of the most subtle (and easy to abuse) features of the C macro preprocessor. Two arguments can be 'glued' together using the ## preprocessor operator; this allows two tokens to be concatenated in the preprocessed code. This can be used to construct elaborate macros which act much like C++ templates (without many of their benefits).
For instance:
#define MYCASE(item,id) \
case id: \
item##_##id = id;\
break
switch(x) {
MYCASE(widget,23);
}
The line MYCASE(widget,23); gets expanded here into
case 23:
widget_23 = 23;
break;
(The semicolon following the invocation of MYCASE becomes the semicolon that completes the break statement.)
Semicolons
One stylistic note about the above macro is that the semicolon on the last line of the macro definition is omitted so that the macro looks 'natural' when written. It could be included in the macro definition, but then there would be lines in the code without semicolons at the end which would throw off the casual reader. Worse, the user could be tempted to include semicolons anyway; in most cases this would be harmless (an extra semicolon denotes an empty statement) but it would cause errors in control flow blocks:
#define PRETTY_PRINT(s) \
printf ("Message: \"%s\"\n", s);
if (n < 10)
PRETTY_PRINT("n is less than 10");
else
PRETTY_PRINT("n is at least 10");
This expands to give two statements (the intended printf and an empty statement) in each branch of the if/else construct, which will cause the compiler to give an error message similar to:
error: expected expression before ‘else’
(gcc 4.1.1)
Multiple statements
Inconsistent use of multiple-statement macros can result in unintended behaviour. The code
#define CMDS \
a = b; \
c = d
if (var == 13)
CMDS;
else
return;
will expand to
if (var == 13)
a = b;
c = d;
else
return;
which causes a syntax error (the else is lacking a matching if).
The macro can be made safe by replacing the internal semicolon with the comma operator, since two operands connected by a comma form a single expression and therefore a single statement. The comma operator is the lowest precedence operator. In particular, its precedence is lower than the assignment operator's, so that a = b, c = d does not parse as a = (b,c) = d. Therefore,
#define CMDS a = b, c = d
if (var == 13)
CMDS;
else
return;
will expand to
if (var == 13)
a = b, c = d;
else
return;
The problem can also be fixed without using the comma operator:
#define CMDS \
do { \
a = b; \
c = d; \
} while (0)
With the same if statement as before, this expands to
if (var == 13)
do {
a = b;
c = d;
} while (0);
else
return;
The do and while (0) are needed to allow the macro invocation to be followed by a semicolon; if they were omitted the resulting expansion would be
if (var == 13) {
a = b;
c = d;
}
;
else
return;
The semicolon in the macro's invocation becomes an empty statement, causing a syntax error at the else by preventing it matching up with the preceding if.
Quoting macro arguments
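Although macro expansion does not occur within a quoted string, the text of the macro arguments can be quoted and treated as a string literal by using the "#" operator (also known as the "stringizing operator"). For example, with the macro
#define QUOTEME(x) #x
the code
printf("%s\n", QUOTEME(1+2));
will expand to
printf("%s\n", "1+2");
This capability can be combined with automatic string literal concatenation to build debugging macros. For example, the dumpme macro below prints the name of an expression and its value, along with the file name and line number:
#define dumpme(x, fmt) printf("%s:%u: %s=" fmt, __FILE__, __LINE__, #x, x)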
int some_function() {
int foo;
/* [a lot of complicated code goes here] */
dumpme(foo, "%d");
/* [more complicated code goes here] */
}
Indirectly quoting macro arguments
The "#" operator can also be used indirectly, so that the argument is macro-expanded before it is quoted. For example, with the macros:
#define FOO bar
#define _QUOTEME(x) #x
#define QUOTEME(x) _QUOTEME(x)
the code
printf("FOO=%s\n", QUOTEME(FOO));
will expand to
printf("FOO=%s\n", "bar");
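Variadic macros
Macros that can take a varying number of arguments (variadic macros) were standardised in C99. They are particularly useful when writing wrappers around functions that take a variable number of parameters, such as printf, for example when logging warnings and errors.
X-Macros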
One little-known usage pattern of the C preprocessor is known as "X-Macros". An X-Macro is an #include file (commonly using a ".def" extension instead of the traditional ".h") that contains a list of similar macro calls (which can be referred to as "component macros"). The include file is then referenced repeatedly in the following pattern (given that the include file is "xmacro.def" and it contains a list of component macros of the style "foo(x, y, z)"):
#define foo(x, y, z) doSomethingWith(x, y, z);
#include "xmacro.def"
#undef foo
#define foo(x, y, z) doSomethingElseWith(x, y, z);
#include "xmacro.def"
#undef foo
(etc...)
The most common usage of X-Macros is to establish a list of C objects and then automatically generate code for each of them. Common sets of objects are a set of global configuration settings, a set of members of a structure, a list of possible XML tags for converting an XML file to a quickly traversable tree, or the body of an enum declaration, although other lists are possible.
Once the X-Macro has been processed to create the list of objects, the component macros can be redefined to generate, for instance, accessor and/or mutator functions. Structure serializing and deserializing are also commonly done.
Here is an example of an X-Macro that establishes a struct and automatically creates serialize/deserialize functions:
(Note: for simplicity, we don't account for endianness or buffer overflows)
File: object.def
struct_member( x, int )
struct_member( y, int )
struct_member( z, int )
struct_member( radius, double )
File: star_table.c
#include <stdio.h>
#include <string.h>
typedef struct
{
#define struct_member( name, type ) type name;
#include "object.def"
#undef struct_member
} star;
void serialize_star( const star *_star, unsigned char *buffer )
{
/* Copy each member's data into buffer and move the pointer. */
#define struct_member( name, type ) memcpy(buffer, (unsigned char *) &(_star->name), sizeof(_star->name) ); buffer += sizeof(_star->name);
#include "object.def"
#undef struct_member
}
void deserialize_star( star *_star, const unsigned char *buffer )
{
/* Copy each member's data out of buffer and move the pointer. */
#define struct_member( name, type ) memcpy((unsigned char *) &(_star->name), buffer, sizeof(_star->name) ); buffer += sizeof(_star->name);
#include "object.def"
#undef struct_member
}
void print_int( int val )
{
printf( "%d", val );
}
void print_double( double val )
{
printf( "%g", val );
}
void print_star( const star *_star )
{
/* print_##type will be replaced with print_int or print_double */
#define struct_member( name, type ) printf( "%s: ", #name ); print_##type( _star->name ); printf("\n");
#include "object.def"
#undef struct_member
}
To avoid creating a separate include file, the list can instead be held in a single macro:
#define object_def \
struct_member( x, int ); \
struct_member( y, int ); \
struct_member( z, int ); \
struct_member( radius, double );
void print_star( const star *_star )
{
/* print_##type will be replaced with print_int or print_double */
#define struct_member( name, type ) printf( "%s: ", #name ); print_##type( _star->name ); printf("\n");
object_def
#undef struct_member
}
User-defined compilation errors and warnings
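The #error directive inserts an error message into the compiler output.
#error "Gaah!"
This prints "Gaah!" in the compiler output and halts compilation at that point. It is useful, for example, in a heavily parameterized body of code to make sure that a particular #define has been introduced from the makefile, e.g.:
#ifdef WINDOWS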
... /* Windows specific code */
#elif defined(UNIX)
... /* Unix specific code */
#else
#error "What's your operating system?"
#endif
Some implementations also support a #warning directive (not part of ANSI C) to print out a warning message in the compiler output, but not stop the compilation process. A typical use is to warn about the usage of some old code, which is now unfavored and only included for compatibility reasons, e.g.:
#warning "Do not use ABC, which is deprecated. Use XYZ instead."
Although the text following the #error or #warning directive does not have to be quoted, it is good practice to do so. Otherwise, there may be problems with apostrophes and other characters that the preprocessor tries to interpret. Microsoft C uses #pragma message ( "text" ) instead of #warning.
Compiler-specific preprocessor features
The #pragma directive is a compiler-specific directive which compiler vendors may use for their own purposes. For instance, a #pragma is often used to allow suppression of specific error messages, manage heap and stack debugging, etc.
C99 introduced a few standard #pragma directives, taking the form #pragma STDC …, which are used to control the floating-point implementation.
Standard positioning macros
Certain symbols are predefined in ANSI C. Two useful ones are __FILE__ and __LINE__, which expand into the current file and line number. For instance:
// debugging macros so we can pin down message provenance at a glance
#define WHERESTR "[file %s, line %d] "
#define WHEREARG __FILE__,__LINE__
printf(WHERESTR ": hey, x=%d\n", WHEREARG, x);
This prints the value of x, preceded by the file and line number, allowing quick access to which line the message was produced on. Note that the WHERESTR argument is concatenated with the string following it.
Compiler-specific predefined macros
Compiler-specific predefined macros are usually listed in the compiler documentation, although this is often incomplete. The Pre-defined C/C++ Compiler Macros project lists "various pre-defined compiler macros that can be used to identify standards, compilers, operating systems, hardware architectures, and even basic run-time libraries at compile-time".
Some compilers can be made to dump at least some of their useful predefined macros, for example:
- GNU C Compiler
gcc -dM -E - < /dev/null
- HP-UX ansi C compiler
cc -v fred.c (where fred.c is a simple test file)
- SCO OpenServer C compiler
cc -## fred.c (where fred.c is a simple test file)
- Sun Studio C/C++ compiler
cc -## fred.c (where fred.c is a simple test file)