# Mac OS X Numerics Benchmarks

## AltiVec

##### 27 Jun 2003
Now that the new G5 Macs have been announced, I have become more interested in AltiVec, which is also found in the G4 chips. I wrote a simple sum-of-square-roots program using vecLib.

My current daily machine is a Power Macintosh G4 (Mirrored Drive Doors) with twin 1 GHz CPUs, 1 GB of RAM, and 360 GB of drive space. This machine can sum 100,000,000 single-precision square roots in less than 5 seconds!

Here's the program, which will build on Mac OS X using GCC or under MPW with MRC:

```c
/*
*	altivec.c - AltiVec benchmark program.
*	altivec is Copyright Daniel K. Allen, 2003. (This program, not the Velocity Engine.)
*
*	26 Jun 2003 - Created by Dan Allen in MPW & Terminal simultaneously.
*
*	Dual 1 GHz G4 Power Mac running Mac OS X 10.2.6 times:
*
*		cc altivec.c -o altivec -framework vecLib -faltivec -O3 -mdynamic-no-pic
*
*			100,000,000 square roots in 4.69 seconds
*	  		1,000,000 square roots in  .05 seconds
*
*		MRC altivec.c -opt speed,unroll -vector on
*
*			100,000,000 square roots in 5.03 seconds
*	  		1,000,000 square roots in  .05 seconds
*
*
*/

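/* Header names below are assumptions, chosen to declare what the code uses. */
#ifdef powerc
#include <vfp.h>            /* assumed: MPW header declaring vsqrtf */
#else
#include <vecLib/vecLib.h>  /* assumed: Mac OS X vecLib umbrella header */
#endif
#include <stdio.h>          /* printf */
#include <stdlib.h>         /* atoi */
#include <time.h>           /* clock, CLOCKS_PER_SEC */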

/* Overlay a 4-float vector with an array so individual lanes can be read and written. */
typedef union {
    vector float v;
    float f[4];
} vf;

int main(int argc,char *argv[])
{
    vf a,b;
    int i = 0,n;
    double sum = 0;
    clock_t t = clock();

    n = (argc == 2) ? atoi(argv[1]) : 1000000;
    while (i < n) {
        /* Load the next four integers into the vector lanes. */
        a.f[0] = i++;
        a.f[1] = i++;
        a.f[2] = i++;
        a.f[3] = i++;
        b.v = vsqrtf(a.v);	/* four single-precision square roots at once */
        sum += b.f[0];
        sum += b.f[1];
        sum += b.f[2];
        sum += b.f[3];
    }
    t = clock() - t;
    printf("Time: %.2f sec\n Sum: %d sqrts = %.8f\n",t/(float)CLOCKS_PER_SEC,i,sum);
    return 0;
}

/*

MRC altivec.c -o altivec.o -opt speed,unroll -vector on
PPCLink -o altivec altivec.o "{PPCLibraries}InterfaceLib" "{PPCLibraries}MathLib" "{PPCLibraries}StdCLib" "{PPCLibraries}StdCRuntime.o" "{PPCLibraries}PPCCRuntime.o" "{PPCLibraries}PPCToolLibs.o"  "{PPCLibraries}vecLib"
SetFile altivec -d . -m . -t MPST -c 'MPS '

*/

```

## Floating Point

#### A quiet improvement in Mac OS X 10.1.2 appears to be faster math library routines, which have greatly improved numerics benchmark scores. GCC 2.95.2 now appears to generate code competitive with MRC, but the gain comes from the improved math libraries in 10.1.2 rather than from the compiler.

Once again this proves that benchmarking rarely measures an individual component but reflects an entire system: CPU, memory, buses, disks, I/O, an operating system, the compiler, and libraries all contribute to the final result.

• All tests performed on a 450 MHz Power Macintosh G4 Cube with 512 MB of RAM.
• Well, all except the Windows test, which ran on a Dell OptiPlex with a 450 MHz Pentium III and 320 MB of RAM.
• Where noted, the machine was also playing www.kpig.com at 128 kbps via iTunes.
• All times are in seconds, thus smaller numbers mean better performance.

### Bench Scores

| OS Version | iTunes | Compiler | Integer Addition | Sum of Sqrts | Simulation | Most Remote (Trig) |
|---|---|---|---|---|---|---|
| Mac OS 9.2.2 | - | MRC -opt speed | 1.17 | 0.23 | 0.07 | 0.30 |
| Mac OS X 10.1 | - | gcc -O3 | 1.09 | 1.15 | 0.11 | 0.49 |
| Mac OS X 10.1.1 | - | gcc -O3 | 1.08 | 1.29 | 0.11 | 0.49 |
| Mac OS X 10.1.2 | - | gcc -O3 | 1.09 | 0.25 | 0.07 | 0.27 |
| Mac OS X 10.1.2 & 9.2.2 | - | MRC -opt speed | 1.15 | 0.23 | 0.07 | 0.30 |
| Windows 2000 SP2 | - | Visual C++ 6.0 -Ox | 1.12 | 0.13 | 0.09 | 0.28 |
| Mac OS X 10.1 | playing | gcc -O3 | 1.11 | 1.35 | 0.12 | 0.51 |
| Mac OS X 10.1.2 | playing | gcc -O3 | 1.14 | 0.27 | 0.07 | 0.28 |
| Mac OS X 10.1.2 & 9.2.2 | playing | MRC -opt speed | 1.33 | 0.27 | 0.10 | 0.35 |
| Mac OS 9.2.2 | playing | MRC -opt speed | 1.25 | 0.25 | 0.08 | 0.33 |

## Mac OS X is now faster than 9.x!

##### Jun 2003
GCC 3.3 is now out and sometime I'll update these numbers for my faster machine and the new compiler.

The bench tool is a collection of small numeric loops of the kind common in scientific and engineering programming. It is written in C by Dan Allen, with contributions by Dr. Paul A. Finlayson of JPL. The benchmarks used to take a long time to run. Fast machines will require us to update them so that the differences are more apparent and do not get lost in the noise.