C++ System Topics Tutorial (C++ Compiler Performance Comparison)

Source: 优易学, 2011-11-20 14:01:53

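Today the mainstream C/C++ compilers on the market include Microsoft's cl, gcc, Intel's icl, PGI's pgcc, and CodeGear's bcc (formerly owned by Borland). On Windows the most widely used is naturally cl, while across the wider range of platforms gcc is the first choice among C/C++ compilers. When it comes to optimization capability, however, the ranking does not necessarily match market share.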
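I compared the numerical performance of these compilers. The test code is a numerical integration program taken from the Intel compiler's sample programs, with one header file changed so that every compiler can build it.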
    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>
    #include <math.h>

    // Function to be integrated: | sin(x) |
    #define INTEG_FUNC(x) fabs(sin(x))

    int main(void)
    {
        // Loop counters and number of subintervals
        unsigned int i, j, N;
        // Step size, independent variable x, and accumulated sum
        double step, x_i, sum;
        // Timing variables for evaluation
        clock_t start, finish;
        double duration;
        // Start integral from
        double interval_begin = 0.0;
        // Complete integral at
        double interval_end = 2.0 * 3.141592653589793238;

        // Start timing for the entire application
        start = clock();
        printf(" \n");
        printf(" Number of | Computed Integral | \n");
        printf(" Interior Points | | \n");
        for (j = 2; j < 27; j++)
        {
            printf("------------------------------------- \n");
            // Compute the number of subintervals
            N = 1u << j;
            // Compute the step size for N subintervals
            step = (interval_end - interval_begin) / N;
            // Half-weight term for the first endpoint: f(x0) * [step/2]
            sum = INTEG_FUNC(interval_begin) * step / 2.0;
            // Composite trapezoidal rule: each interior point x_i
            // contributes f(x_i) * step
            for (i = 1; i < N; i++)
            {
                x_i = i * step;
                sum += INTEG_FUNC(x_i) * step;
            }
            // Half-weight term for the last endpoint: f(xN) * [step/2]
            sum += INTEG_FUNC(interval_end) * step / 2.0;
            // N is unsigned, so print it with %u
            printf(" %10u | %14e | \n", N, sum);
        }
        finish = clock();
        // Elapsed processor time in clock() ticks
        duration = (double)(finish - start);
        printf(" \n");
        printf(" Application Clocks = %10e \n", duration);
        printf(" \n");
        return 0;
    }
Of course, the benchmark code comes from Intel, so it naturally suits Intel's compiler well. The tests below were run on an Intel Core 2 Duo.
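Compilers tested:

    gcc     gcc (GCC TDM-2 for MinGW) 4.3.0
    cl      VC 9.0 (cl 15.00.21022.08)
    icl     Intel (icl 10.1)
    pgcc    PGI (pgcc 7.16)
    bcc32   CodeGear (bcc32 6.10)

Application Clocks reported by the program (lower is better):

                  gcc               cl              icl     pgcc           bcc32

    Optimization disabled
    flags         -O0               /Od             -Od     -O0            -Od
    run 1         17161             14461           12441   10514          13400
    run 2         17133             14430           11687   9956           12917
    run 3         17155             14476           11871   10099          13026

    Compile option -O2
    run 1         13011             7737            4540    9348           12636
    run 2         16571             7706            4185    9148           13026
    run 3         16573             7706            4042    9183           13057

    Platform-specific optimization
    flags         -march=core2 -O2  /arch:SSE2 /O2  -QxT    -tp core2 -O2  none
    result        16060             7710            1938    9578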
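The results show that for numerical computation Intel's compiler is very strong; in particular, its optimizations for a specific CPU improve performance considerably. GCC's showing is somewhat disappointing. Comparing the runs with optimization disabled against -O2, the optimizations in Intel's and Microsoft's compilers clearly pay off, while the other compilers gain very little. As a ranking: icl > cl > pgcc > bcc > gcc.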
In addition, on a 1.5 GHz Pentium 4 machine running Linux, the tests gave:
                  gcc                  icc       pgCC

    flags         -O2                  -O2       -O2
    clocks        24920000             10840000  22270000

    flags         -O0                  -O0       -O0
    clocks        28290000             19210000  24320000

    flags         -march=pentium4 -O2  -xN       -tp piv -O2
    clocks        24990000             6640000   22150000
Again, Intel performs best and gcc worst.
Tests on an Athlon X2 4800+ under Linux gave the following table:
                  gcc                  icc         pgcc

    flags         -O0                  -O0         -O0
    clocks        9390000              14950000    9950000

    flags         -O2                  -O2         -O2
    clocks        8910000              9240000     9400000

    flags         -march=amdfam10 -O2  -msse3 -O2  -tp k8-32 -O2
    clocks        8800000              3800000     9030000
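Although icc mainly targets Intel processors, with the right optimization options it can also deliver a large performance gain on AMD CPUs. gcc returns to an ordinary level here. The odd one is the PGI compiler; I probably just have not found good options for it yet.
In summary, for numerical computation the "fastest" choice is Intel's compiler.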

Editor: 小草