[Repost] Preprocessor

Original article:
http://pl-learning-blog.logdown.com/posts/1049271-usually-terror-words-o-muhammad-c-ch11-reading-notes-unfinished

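11.1 Replacing a line with a file's contents (include)

In Chapter 5 we met the preprocessor directive include. When you write #include <filename>, the preprocessor looks for a file of that name in the include directory under MinGW, or in /usr/include, and replaces the #include line with that file's contents. For example, suppose we write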

#include <stdio.h>

That line is then replaced with something like the following:

# 28 "/usr/include/stdio.h" 2 3 4
# 1 "/usr/lib/gcc/x86_64-linux-gnu/5/include/stddef.h" 1 3 4
# 216 "/usr/lib/gcc/x86_64-linux-gnu/5/include/stddef.h" 3 4
# 216 "/usr/lib/gcc/x86_64-linux-gnu/5/include/stddef.h" 3 4
typedef long unsigned int size_t;
# 34 "/usr/include/stdio.h" 2 3 4
# 1 "/usr/include/x86_64-linux-gnu/bits/types.h" 1 3 4
# 27 "/usr/include/x86_64-linux-gnu/bits/types.h" 3 4
# 1 "/usr/include/x86_64-linux-gnu/bits/wordsize.h" 1 3 4
# 28 "/usr/include/x86_64-linux-gnu/bits/types.h" 2 3 4
typedef unsigned char __u_char;
typedef unsigned short int __u_short;
typedef unsigned int __u_int;
typedef unsigned long int __u_long;
typedef signed char __int8_t;
typedef unsigned char __uint8_t;
typedef signed short int __int16_t;
typedef unsigned short int __uint16_t;
typedef signed int __int32_t;
typedef unsigned int __uint32_t;

typedef signed long int __int64_t;
typedef unsigned long int __uint64_t;

typedef long int __quad_t;
typedef unsigned long int __u_quad_t;
# 121 "/usr/include/x86_64-linux-gnu/bits/types.h" 3 4
# 1 "/usr/include/x86_64-linux-gnu/bits/typesizes.h" 1 3 4
# 122 "/usr/include/x86_64-linux-gnu/bits/types.h" 2 3 4

typedef unsigned long int __dev_t;
typedef unsigned int __uid_t;
typedef unsigned int __gid_t;
typedef unsigned long int __ino_t;
typedef unsigned long int __ino64_t;
typedef unsigned int __mode_t;
typedef unsigned long int __nlink_t;
typedef long int __off_t;
typedef long int __off64_t;
typedef int __pid_t;
typedef struct { int __val[2]; } __fsid_t;
typedef long int __clock_t;
typedef unsigned long int __rlim_t;
typedef unsigned long int __rlim64_t;
typedef unsigned int __id_t;
typedef long int __time_t;
typedef unsigned int __useconds_t;
typedef long int __suseconds_t;

typedef int __daddr_t;
typedef int __key_t;

/* many lines omitted */

extern void funlockfile (FILE *__stream) __attribute__ ((__nothrow__ , __leaf__));
# 942 "/usr/include/stdio.h" 3 4

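Let's look at the include directive in a little more detail. The file named is usually a header file. A header file generally contains data type definitions, constants, and function declarations. A name enclosed in <> is searched for in the places where the standard C library's header files live: the include directory under MinGW, or /usr/include.

If you write many functions of your own, you will want a header file dedicated to them. In that case, enclose the name of your own header file in "". The file is then searched for in the same directory as the source code. Let's start by building a sample program.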

source code
main.c

#include <stdio.h>
#include "functions.h" // header file for the functions we wrote ourselves

int main(){

    int num_1;
    int num_2;
    int answer;

    num_1 = 1;
    num_2 = 2;

    // call sum(), whose prototype is declared in functions.h
    answer = sum(num_1, num_2);
    printf("answer = %d\n", answer);

    answer = sub(num_1, num_2);
    printf("answer = %d\n", answer);

    answer = mul(num_1, num_2);
    printf("answer = %d\n", answer);

    answer = div(num_1, num_2);
    printf("answer = %d\n", answer);

    return 0;
}

source code
functions.h

int sum(int, int);
int sub(int, int);
int mul(int, int);
int div(int, int);

source code
functions.c

int sum(int a, int b){

    int return_value;

    return_value = a + b;

    return (return_value);
}

int sub(int a, int b){

    int return_value;

    return_value = a - b;

    return (return_value);
}

int mul(int a, int b){

    int return_value;

    return_value = a * b;

    return (return_value);
}

int div(int a, int b){

    int return_value;

    return_value = a / b;

    return (return_value);
}

Compiling takes three commands:

gcc -g -Wall -c -o main.o main.c
gcc -g -Wall -c -o functions.o functions.c
gcc -g -Wall -o include main.o functions.o

This produces an executable named include. Written as a makefile, it looks like this:

PROGRAM   = include
OBJS      = main.o functions.o
SRCS      = $(OBJS:%.o=%.c)
CC        = gcc
CFLAGS    = -g -Wall
LDFLAGS   =

$(PROGRAM): $(OBJS)
	$(CC) $(CFLAGS) $(LDFLAGS) -o $(PROGRAM) $(OBJS) $(LDLIBS)

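Since this experiment was about a header file of our own, we also tried "separate compilation". Let's go over what the three compile commands mean. The first compiles and assembles main.c into main.o; the second compiles and assembles functions.c into functions.o; the third links main.o and functions.o together with the standard C library to produce the executable include.

It is a bit of a detour, but let's think about why separate compilation and our own header files matter. In Chapter 10, when we wrote our own functions, we explained that a prototype must always be declared. With the program split across several files for separate compilation, a new function added to functions.c gets its prototype in functions.h, so no new prototype declaration has to be added to main.c.

Another case is when main.c and functions.c are written by different people. Suppose the prototypes were declared in main.c, and the author of functions.c changed a function's declaration slightly, but the author of main.c did not know and left the prototype as it was. Compiling and linking may then both finish without any complaint, and the error only shows up at run time.

To avoid this, the author of functions.c should also maintain functions.h, which declares the prototypes, and the author of main.c only has to include functions.h. Then, if functions.c changes, compiling main.c produces a warning or an error, and its author notices and can fix the places where the function is used.

Also, when you develop alone and want another program to use functions.c, you only need to copy functions.c and functions.h and have the other program include the header; it can then use the functions in functions.c.

For these reasons, when a program is split into several files and compiled separately, the header files are defined separately as well. This is also why the declarations of the standard C library functions are collected in header files. If those header files did not exist, every program that used printf() or fgets() would have to declare the prototypes itself at the top, like this: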

int printf(const char *format, ...);
char *fgets (char *str, int size, FILE *stream);
typedef long time_t;
time_t time(time_t *tloc);
char *ctime(const time_t *clock);

Writing all of that every time would be tedious. Written like this, it is much easier:

#include <stdio.h>
#include <time.h>

Running the include executable gives the following.
Output:

answer = 3
answer = -1
answer = 2
answer = 0

Just as expected. Now enter the following command to see how #include "functions.h" in main.c was replaced:

gcc -E main.c > main.txt

The resulting txt file contains the following:

// (earlier lines omitted)
# 2 "main.c" 2
# 1 "functions.h" 1

# 1 "functions.h"
int sum(int, int);
int sub(int, int);
int mul(int, int);
int div(int, int);
# 3 "main.c" 2
// (remaining lines omitted)

Incidentally, include is not limited to header files; it can pull in any file. Just for fun, try building the following example.

source code
a.c

#include "a.5"

;
    return 0;
}

a.1

#include <stdio.h>

a.2

#include "a.1"

int

a.3

#include "a.2"

main(){

a.4

printf

a.5

#include "a.3"
#include "a.4"

("hello, world\n")

Enter

gcc -g -Wall a.c -o a

and then

./a

and the classic hello, world program runs.

Output:

hello, world

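Any text file can be included. Nobody would ever write code like the above, but doing it on purpose drives home the point that include is just text substitution.

11.2 One-to-one substitution (define)

There is another directive similar to include: define. Whereas include substitutes a whole file, define substitutes text within the source code. It is used like this:

#define NAME replacement

In prime_3.c, the Chapter 9 sample program that finds prime numbers, the line end = 100; means that primes are found up to 100. With define it can be rewritten like this: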

#define MAX 100
end = MAX;

During preprocessing, every occurrence of the token MAX is replaced with 100 because of the define. Let's try it right away.

source code
define.c

#include <stdio.h>

#define INTEGER_NUM_1 100
#define FLOAT_NUM_1 3.14
#define STRING_1 "%s"
#define STRING_2 "hello, world\n"

int main(){

    #define INTEGER_NUM_2 200
    #define FLOAT_NUM_2 2.71

    int a = INTEGER_NUM_1;
    printf("%d %d\n", a, INTEGER_NUM_2);

    float b = FLOAT_NUM_1;
    float c = FLOAT_NUM_2;

    printf("%f %f\n", b,c);

    printf(STRING_1, STRING_2);

    return 0;
}

Output:

100 200
3.140000 2.710000
hello, world

How was that? It should be clear that this is plain text substitution.
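To connect this back to the prime-number example, here is a minimal sketch in which MAX sets both the array size and the loop bound. The function names count_primes_up_to_max and check_max_macro are made up for this illustration and are not from prime_3.c, whose actual code is not shown here.

```c
#include <assert.h>

#define MAX 100   /* every MAX below is replaced by 100 before compilation */

/* Counts the primes in 2..MAX with a small sieve; the array size and both
   loop bounds come from the same macro. */
int count_primes_up_to_max(void)
{
    char is_composite[MAX + 1] = {0};
    int count = 0;

    for (int n = 2; n <= MAX; n++) {
        if (is_composite[n])
            continue;
        count++;
        for (int multiple = n * 2; multiple <= MAX; multiple += n)
            is_composite[multiple] = 1;
    }
    return count;
}

/* There are 25 primes up to 100, so this holds while MAX is 100. */
void check_max_macro(void)
{
    assert(count_primes_up_to_max() == 25);
}
```

Changing the single #define line is enough to search a different range; nothing else in the file has to be touched. Call check_max_macro() from your own main() to run the check.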

11.3 Cancelling a define (undef)

Once a define has been written, it stays in effect for all the code after that line. If for some reason you want to cancel it, use undef. Usage is simple:

#undef NAME

Example:

#undef INTEGER_NUM_1

Try it on define.c from earlier: put this line before main(), and the line int a = INTEGER_NUM_1; will fail to compile.
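The effect of position can be sketched in one file. BUFFER_SIZE and the function names below are hypothetical, chosen only for this illustration: code before the #undef sees the first value, and code after a second #define sees the new one.

```c
#include <assert.h>

#define BUFFER_SIZE 16

/* Compiled while BUFFER_SIZE means 16. */
int size_before_undef(void) { return BUFFER_SIZE; }

#undef BUFFER_SIZE       /* from here on BUFFER_SIZE is no longer a macro */

#define BUFFER_SIZE 64   /* defining it again after #undef is allowed */

/* Compiled while BUFFER_SIZE means 64. */
int size_after_redefine(void) { return BUFFER_SIZE; }

void check_undef(void)
{
    assert(size_before_undef() == 16);
    assert(size_after_redefine() == 64);
}
```

Using BUFFER_SIZE between the #undef and the second #define would be a compile error, just like INTEGER_NUM_1 in the experiment above.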

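11.4 Conditional compilation (if)

define has one more use, and it is paired with if.

Conditional compilation based on a value (if)

With if, the source code itself branches according to a defined value (this happens before the code is compiled). That is hard to picture, so let's experiment with the following sample program.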

source code
define_2.c

#define TEST 1

#if TEST == 1
#include <stdio.h>

int main(){

    printf("hello, world\n");

    return 0;
}

#else
#endif

First enter this line:

gcc -E define_2.c > define_2.txt

Then look at define_2.txt: the main() function appears at the very end. Now change #define TEST 1 to #define TEST 0 and try again. This time define_2.txt is almost empty:

# 1 "define_2.c"
# 1 "<built-in>"
# 1 "<command-line>"
# 1 "/usr/include/stdc-predef.h" 1 3 4
# 1 "<command-line>" 2
# 1 "define_2.c"

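The reason is this: with #define TEST 1 the condition #if TEST == 1 holds, so the lines between #if and #else are used; with #define TEST 0 the condition fails, so the lines between #else and #endif are used, and there are none.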
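A variation with both branches filled in may make the selection easier to see. This is only a sketch: selected_branch and greeting are hypothetical names. The same two functions are defined twice, which would be a redefinition error if both branches reached the compiler.

```c
#include <assert.h>

#define TEST 1

#if TEST == 1
/* Only these definitions reach the compiler while TEST is 1. */
const char *greeting(void) { return "hello, test build"; }
int selected_branch(void) { return 1; }
#else
/* These are dropped by the preprocessor, so there is no duplicate. */
const char *greeting(void) { return "hello, world"; }
int selected_branch(void) { return 0; }
#endif

void check_if_branch(void)
{
    assert(selected_branch() == 1);
    assert(greeting()[0] == 'h');
}
```

Changing the first line to #define TEST 0 makes selected_branch() return 0 instead, without any other edit.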

Next, delete the #define TEST 1 line and run these two commands in turn:

gcc -E -DTEST=1 define_2.c > define_2.txt
gcc -E -DTEST=0 define_2.c > define_2.txt

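The two commands produce different define_2.txt files: with -DTEST=1 the main() function is there, and with -DTEST=0 it is not. In this way the content of a define can also be supplied through a preprocessor option. This feature is quite distinctive and is rarely seen outside C. For example, if a program contains both a Windows version and a Linux version of some code, options set in the Makefile can select the right one for each environment. To make this convenient, when -D names a macro without giving a value, gcc defines it as 1. In other words, passing the option

-DTEST

on the command line has the same effect as putting

#define TEST 1

at the top of the source file. A bare #define TEST written in the source is a different thing: it defines TEST with an empty replacement, so #ifdef TEST succeeds but #if TEST == 1 does not compile.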
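One point is worth checking by experiment: a #define written in the source with no replacement text expands to nothing, whereas a -D option with no value on the gcc command line defines the macro as 1. The sketch below makes the difference visible by turning each macro's expansion into a string. STRINGIFY, EMPTY_FLAG, ONE_FLAG and the function names are helpers invented for this illustration.

```c
#include <assert.h>

/* Two-step helper: the outer macro expands its argument first, then the
   inner one turns the result into a string literal with the # operator. */
#define STRINGIFY_(x) #x
#define STRINGIFY(x) STRINGIFY_(x)

#define EMPTY_FLAG      /* no replacement text: expands to nothing */
#define ONE_FLAG 1      /* what -DONE_FLAG on the command line would give */

/* Number of characters each macro expands to. */
int empty_flag_length(void) { return (int)sizeof(STRINGIFY(EMPTY_FLAG)) - 1; }
int one_flag_length(void)   { return (int)sizeof(STRINGIFY(ONE_FLAG)) - 1; }

void check_flag_expansions(void)
{
    assert(empty_flag_length() == 0);   /* expansion is "" */
    assert(one_flag_length() == 1);     /* expansion is "1" */
}
```

This is why #if EMPTY_FLAG == 1 would be rejected (the preprocessor sees "#if == 1"), while #if ONE_FLAG == 1 works.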

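Conditional compilation by checking whether a macro is defined

If all you need to know is whether a macro has been defined at all, you can use ifdef instead of spelling out a condition after if. #ifdef TEST selects its block whenever TEST is defined, whatever its value, so it is not quite the same as #if TEST == 1: with #define TEST 0 the #ifdef block is still used. The opposite test is ifndef, which selects its block only when TEST is not defined. Examples: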

#ifdef TEST
#include <stdio.h>

int main(){

    printf("hello, world\n");

    return 0;
}

#else
#endif

#ifndef TEST
#else
#include <stdio.h>

int main(){

    printf("hello, world\n");

    return 0;
}
#endif

(Reposter's note: I do not see the point of writing it this way, because the same program runs whether TEST is 1 or 0. Besides, where did the #define go?)
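The question in the note can be checked with a small experiment. In this sketch TEST is defined as 0; the helper macros TEST_IS_DEFINED and TEST_IS_NONZERO and the two functions are hypothetical names. #ifdef only asks whether a definition exists, while #if looks at the value. In the examples above no #define appears because the definition is expected to come from outside, for instance from -DTEST on the gcc command line.

```c
#include <assert.h>

#define TEST 0   /* defined, but with the value 0 */

#ifdef TEST
#define TEST_IS_DEFINED 1   /* taken: a definition exists */
#else
#define TEST_IS_DEFINED 0
#endif

#if TEST
#define TEST_IS_NONZERO 1
#else
#define TEST_IS_NONZERO 0   /* taken: the value is 0 */
#endif

int test_is_defined(void) { return TEST_IS_DEFINED; }
int test_is_nonzero(void) { return TEST_IS_NONZERO; }

void check_ifdef_vs_if(void)
{
    assert(test_is_defined() == 1);
    assert(test_is_nonzero() == 0);
}
```

So #ifdef suits on/off switches that are either passed to the compiler or left out, and #if suits switches that carry a value.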

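Predefined macros

gcc itself predefines many macros. For example, gcc on Linux performs the equivalent of #define linux 1 during preprocessing even when no options are given. (This applies to gcc's default GNU mode; with a strict ISO option such as -std=c11, only the double-underscore forms such as __linux__ remain.) So a program meant to work across platforms (Linux, Windows, and other operating systems) can be written like this: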

#ifdef linux
 // Linux-specific code
#else
 // code for OSes other than Linux
#endif

Entering the following command

gcc -dM -xc -E /dev/null

lists the macros that gcc on Linux has already defined. On the author's machine the output looks like this:

#define __SSP_STRONG__ 3
#define __DBL_MIN_EXP__ (-1021)
#define __UINT_LEAST16_MAX__ 0xffff
#define __ATOMIC_ACQUIRE 2
#define __FLT_MIN__ 1.17549435082228750797e-38F
#define __GCC_IEC_559_COMPLEX 2
#define __UINT_LEAST8_TYPE__ unsigned char
#define __SIZEOF_FLOAT80__ 16
#define __INTMAX_C(c) c ## L
#define __CHAR_BIT__ 8
#define __UINT8_MAX__ 0xff
#define __WINT_MAX__ 0xffffffffU
#define __ORDER_LITTLE_ENDIAN__ 1234
#define __SIZE_MAX__ 0xffffffffffffffffUL
#define __WCHAR_MAX__ 0x7fffffff
#define __GCC_HAVE_SYNC_COMPARE_AND_SWAP_1 1
#define __GCC_HAVE_SYNC_COMPARE_AND_SWAP_2 1
#define __GCC_HAVE_SYNC_COMPARE_AND_SWAP_4 1
#define __DBL_DENORM_MIN__ ((double)4.94065645841246544177e-324L)
#define __GCC_HAVE_SYNC_COMPARE_AND_SWAP_8 1
#define __GCC_ATOMIC_CHAR_LOCK_FREE 2
#define __GCC_IEC_559 2
#define __FLT_EVAL_METHOD__ 0
#define __unix__ 1
// (many lines omitted)
#define linux 1
#define __SSE2__ 1
#define __LDBL_MANT_DIG__ 64
#define __DBL_HAS_QUIET_NAN__ 1
#define __SIG_ATOMIC_MIN__ (-__SIG_ATOMIC_MAX__ - 1)
#define __code_model_small__ 1
#define __k8__ 1
#define __INTPTR_TYPE__ long int
#define __UINT16_TYPE__ short unsigned int
#define __WCHAR_TYPE__ int
#define __SIZEOF_FLOAT__ 4
#define __UINTPTR_MAX__ 0xffffffffffffffffUL
#define __DEC64_MIN_EXP__ (-382)
#define __INT_FAST64_MAX__ 0x7fffffffffffffffL
#define __GCC_ATOMIC_TEST_AND_SET_TRUEVAL 1
#define __FLT_DIG__ 6
#define __UINT_FAST64_TYPE__ long unsigned int
#define __INT_MAX__ 0x7fffffff
#define __amd64__ 1
#define __INT64_TYPE__ long int
#define __FLT_MAX_EXP__ 128
#define __ORDER_BIG_ENDIAN__ 4321
#define __DBL_MANT_DIG__ 53
#define __SIZEOF_FLOAT128__ 16
#define __INT_LEAST64_MAX__ 0x7fffffffffffffffL
#define __GCC_ATOMIC_CHAR16_T_LOCK_FREE 2
#define __DEC64_MIN__ 1E-383DD
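// (remaining lines omitted)

When writing a program that has to run in several environments, use this command to find out which predefined values you can rely on.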
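As a rough sketch of how such predefined values are typically used, the function below reports the platform it was compiled for. It relies on the widely used predefined macros __linux__, _WIN32 and __APPLE__, and on the defined() operator and the #elif directive, neither of which appears in the text above; platform_name is a hypothetical name.

```c
#include <assert.h>

/* Returns a label chosen at preprocessing time from the compiler's
   predefined macros; only one return statement reaches the compiler. */
const char *platform_name(void)
{
#if defined(__linux__)
    return "Linux";
#elif defined(_WIN32)
    return "Windows";
#elif defined(__APPLE__)
    return "macOS or another Apple OS";
#else
    return "unknown";
#endif
}

void check_platform_name(void)
{
    /* The label differs per platform, but it is never empty. */
    assert(platform_name()[0] != '\0');
}
```

The double-underscore spellings are preferred here because, unlike plain linux, they cannot collide with ordinary identifiers.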

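Scary experiment: when a predefined macro has the same name as a variable

Macros such as #define linux 1 are very handy when writing programs that must run in several environments, but there is a trap. What happens with the following sample program?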

source code
define_3.c

#include <stdio.h>

int main(){

    char linux[16] = "For Linux!";

    printf("%s\n", linux);

    return 0;
}

Compiling it produces the following error and warning:

define_3.c: In function ‘main’:
define_3.c:5:7: error: expected identifier or ‘(’ before numeric constant
  char linux[16] = "For Linux!";
       ^
define_3.c:7:9: warning: format ‘%s’ expects argument of type ‘char *’, but argument 2 has type ‘int’ [-Wformat=]
  printf("%s\n", linux);
         ^

This is because the system has already done #define linux 1, so the substitution turns our code into

char 1[16] = "For Linux!";

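which explains the odd result.

Most of the macros gcc predefines contain __ (two underscores), so this rarely happens, but it is still worth watching for. Besides linux, the names i386 and unix can collide with your own identifiers in the same way.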
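Renaming the variable is the simplest fix. If the name has to stay, one possible workaround, shown here only as a sketch, is to cancel the predefined macro for the file with #undef. The function names first_letter and check_linux_identifier are hypothetical.

```c
#include <assert.h>

/* In its default GNU mode, gcc on Linux predefines linux as 1.
   Remove that definition for the rest of this file, if it exists. */
#ifdef linux
#undef linux
#endif

int first_letter(void)
{
    char linux[16] = "For Linux!";   /* now an ordinary identifier */
    return linux[0];
}

void check_linux_identifier(void)
{
    assert(first_letter() == 'F');
}
```

After the #undef, any #ifdef linux later in the same file would no longer succeed, so tests for the platform should use __linux__ instead.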

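11.5 Function-like macros (macro functions)

define can also take arguments and substitute them into the replacement text, which means it can be used to build something that looks like a function. This is called a "macro function".

Writing a macro function

For example, a macro function that does the same as sum() in functions.c is written like this:

#define SUM(a, b) a + b

Let's run a sample program first.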

source code
define_4.c

#include <stdio.h>

#define SUM(a, b) a + b
#define SUB(a, b) a - b
#define MUL(a, b) a * b
#define DIV(a, b) a / b

int main(){

    int num_1;
    int num_2;
    int answer;

    num_1 = 1;
    num_2 = 2;

    answer = SUM(num_1, num_2);
    printf("answer = %d\n", answer);

    answer = SUB(num_1, num_2);
    printf("answer = %d\n", answer);

    answer = MUL(num_1, num_2);
    printf("answer = %d\n", answer);

    answer = DIV(num_1, num_2);
    printf("answer = %d\n", answer);

    return 0;
}

Output:

answer = 3
answer = -1
answer = 2
answer = 0

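The result is as expected.

A macro function is not a function!

A macro function has the form "name(arguments)" and looks like a function in every way. In reality, functions and macro functions are completely different things. Like every other define we have seen, a macro function is nothing more than text substitution.

Let's run define_4.c through the preprocessor with gcc -E and look at the result.

gcc -E define_4.c > define_4.txt

The original lines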

answer = SUM(num_1, num_2);
printf("answer = %d\n", answer);

answer = SUB(num_1, num_2);
printf("answer = %d\n", answer);

answer = MUL(num_1, num_2);
printf("answer = %d\n", answer);

answer = DIV(num_1, num_2);
printf("answer = %d\n", answer);

have become the following at the end of define_4.txt:

 answer = num_1 + num_2;
 printf("answer = %d\n", answer);

 answer = num_1 - num_2;
 printf("answer = %d\n", answer);

 answer = num_1 * num_2;
 printf("answer = %d\n", answer);

 answer = num_1 / num_2;
 printf("answer = %d\n", answer);

 return 0;

SUM(num_1, num_2); has vanished without a trace and become num_1 + num_2;, and the same goes for the other expressions.
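Because the arguments are pasted in as text, operator precedence applies to the pasted result. The sketch below compares the macro as written in define_4.c with a fully parenthesized variant; SUM_PLAIN, SUM_PAREN and the function names are hypothetical, introduced only for the comparison.

```c
#include <assert.h>

#define SUM_PLAIN(a, b) a + b          /* same body as SUM in define_4.c */
#define SUM_PAREN(a, b) ((a) + (b))    /* arguments and result parenthesized */

/* Expands to 1 + 2 * 3, so the multiplication happens first. */
int plain_times_three(void) { return SUM_PLAIN(1, 2) * 3; }

/* Expands to ((1) + (2)) * 3. */
int paren_times_three(void) { return SUM_PAREN(1, 2) * 3; }

void check_sum_precedence(void)
{
    assert(plain_times_three() == 7);
    assert(paren_times_three() == 9);
}
```

A real function sum(1, 2) * 3 would give 9, so the unparenthesized macro silently differs from the function it imitates.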

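Scary experiment: the traps of macro functions

Because macro functions are convenient, they are used a great deal in C. SUM() above is not very practical, so nobody would bother writing it as a macro. But since a macro is text substitution and does not care about data types, swapping two variables a and b is often written as a macro like the one below (this XOR version works for any integer type):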

#define SWAP(a, b) a^=b; b^=a; a^=b;

However, macro functions have traps. First, run a sample program that works correctly.

source code:
define_5.c

#include <stdio.h>

#define SWAP(a, b) a^=b; b^=a; a^=b;

int main(){

    int num_1;
    int num_2;

    num_1 = 1;
    num_2 = 2;

    printf("num_1 = %d, num_2 = %d\n", num_1, num_2);
    SWAP(num_1, num_2);
    printf("num_1 = %d, num_2 = %d\n", num_1, num_2);

    return 0;
}

Output:

num_1 = 1, num_2 = 2
num_1 = 2, num_2 = 1

The variables num_1 and num_2 really were swapped. Next, use if so that the swap happens only when num_1 equals 0, deliberately without a block, like this:

printf("num_1 = %d, num_2 = %d\n", num_1, num_2);
if(num_1 == 0)
    SWAP(num_1, num_2);
printf("num_1 = %d, num_2 = %d\n", num_1, num_2);

Output:

num_1 = 1, num_2 = 2
num_1 = 2, num_2 = 3

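The result is strange. num_1 is 1, so the macro function SWAP() should not have run at all, yet the values changed, and to odd numbers at that. Let's use gcc -E to find out why. Enter: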

gcc -E define_5.c > define_5.txt

In define_5.txt, the SWAP() call has been replaced as follows:

printf("num_1 = %d, num_2 = %d\n", num_1, num_2);
if(num_1 == 0)
    num_1^=num_2; num_2^=num_1; num_1^=num_2;;
printf("num_1 = %d, num_2 = %d\n", num_1, num_2);

Because there is no {}, the only statement controlled by the if is num_1^=num_2;. Written out with {}, the code above means this:

printf("num_1 = %d, num_2 = %d\n", num_1, num_2);
if(num_1 == 0){
    num_1^=num_2;
}
num_2^=num_1;
num_1^=num_2;;
printf("num_1 = %d, num_2 = %d\n", num_1, num_2);

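That is why the result is strange. Had the if been given a block, there would have been no problem. But collaborators will not always put a block after if, and a mistake like this does not stop compilation, so we want SWAP() itself to cope with a missing block. Taking advantage of text substitution, it can be written like this: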

#define SWAP(a, b) {a^=b; b^=a; a^=b;}

Written this way, it behaves as intended. After substitution the two cases look like this:

/* if without a block */
printf("num_1 = %d, num_2 = %d\n", num_1, num_2);
if(num_1 == 0)
    {num_1^=num_2; num_2^=num_1; num_1^=num_2;};
printf("num_1 = %d, num_2 = %d\n", num_1, num_2);

/* if with a block */
printf("num_1 = %d, num_2 = %d\n", num_1, num_2);
if(num_1 == 0){
    {num_1^=num_2; num_2^=num_1; num_1^=num_2;};
}
printf("num_1 = %d, num_2 = %d\n", num_1, num_2);

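Incidentally, a macro function can also contain more complex processing such as branches. In that case, depending on the control statement it is used in, the braces-only form can cause a compile error. For example, if (...) SWAP(a, b); else ... no longer compiles, because the ; after the closing brace ends the if statement and leaves the else on its own. The usual remedy is a do while that never repeats:

#define SWAP(a, b) do{ \
                    /* an if could be written here */ \
                    (a)^=(b); \
                    (b)^=(a); \
                    (a)^=(b); \
                    }while(0)

The variables are written as (a) and (b), wrapped in parentheses, to avoid operator precedence problems. A \ at the end of a line means the define continues on the next line. For the same reason, a comment inside the macro has to use /* */: a // comment followed by \ would swallow all the continued lines after it. Finally, a function call is normally followed by ;, so no ; is written after while(0) in the definition; the ; written by the caller completes the statement.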
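The benefit of the do while form shows up when the macro sits in front of an else. The following is a minimal sketch; sort_pair and check_swap_macro are hypothetical names used only here.

```c
#include <assert.h>

/* XOR swap for integer variables, wrapped so that it acts as one statement. */
#define SWAP(a, b) do { \
        (a) ^= (b);     \
        (b) ^= (a);     \
        (a) ^= (b);     \
    } while (0)

/* Puts the smaller value in *lo. Returns 1 if a swap was needed, else 0.
   lo and hi must point to different variables, because an XOR swap of a
   variable with itself sets it to 0. */
int sort_pair(int *lo, int *hi)
{
    int swapped = 1;

    if (*lo > *hi)
        SWAP(*lo, *hi);   /* expands to do { ... } while (0); */
    else                  /* still pairs with the if above */
        swapped = 0;
    return swapped;
}

void check_swap_macro(void)
{
    int a = 5, b = 3;

    assert(sort_pair(&a, &b) == 1);
    assert(a == 3 && b == 5);
    assert(sort_pair(&a, &b) == 0);   /* already ordered: else branch runs */
}
```

With the braces-only definition, the same if/else would be rejected by the compiler, because the ; after the closing brace would separate the else from its if.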

None of this is needed right away, but it does no harm to remember it.

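11.6 Closing remarks

In this chapter we learned the preprocessor directives that are characteristic of C.

They are hard to grasp because most other languages have nothing like them, but once you have seen the substitutions with gcc -E, you will probably find them surprisingly simple.

Header files such as stdio.h contain many constants defined with define. On UNIX-like OSes, the header files that let you use lists and queues like collections are also implemented almost entirely with macros (so that they can be used regardless of data type).

Learning the preprocessor directives does not dramatically increase what you can do, but they are used everywhere in C, so remember them.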

立你斯
