Stack alignment on x86
On the Pentium and subsequent x86 processors, there is a substantial performance penalty if double-precision variables are not stored 8-byte aligned; a slowdown by a factor of two or more is not unusual. Unfortunately, the stack (where local variables and subroutine arguments live) is not guaranteed by the Intel ABI to be 8-byte aligned.
Recent versions of gcc (as well as most other compilers, we are told,
such as Intel's, Metrowerks', and Microsoft's) are able to keep the
stack 8-byte aligned; gcc does this by default (see
-mpreferred-stack-boundary in the gcc documentation).
If you are not certain whether your compiler maintains stack alignment
by default, it is a good idea to make sure.
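
One rough way to make sure is to look at the address of a local double at run time. The following is a minimal sketch, not part of FFTW; the helper names is_aligned and local_double_is_aligned are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Return 1 if address p is a multiple of n bytes.
   n must be a nonzero power of two. */
static int is_aligned(const void *p, size_t n)
{
    assert(n != 0 && (n & (n - 1)) == 0);
    return ((uintptr_t)p & (n - 1)) == 0;
}

/* Return 1 if a local double in this function's stack frame
   happens to be 8-byte aligned, 0 otherwise. */
static int local_double_is_aligned(void)
{
    volatile double x = 0.0;  /* volatile forces x to live in memory */
    return is_aligned((const void *)&x, 8);
}
```

Calling local_double_is_aligned() from main and from a few nested functions, and printing a warning when it returns 0, gives a quick indication. A result of 1 only shows that the frames you checked were aligned; it does not prove that the compiler maintains alignment everywhere.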
Unfortunately, gcc only preserves the stack alignment; it does not
establish it. As a result, if the stack starts off misaligned, it will
always be misaligned, with a disastrous effect on performance (in
double precision). Fortunately, recent versions of glibc (on
GNU/Linux) provide a properly aligned starting stack, but this was not
the case with a number of older versions, and we are not certain of
the situation on other operating systems. Hopefully, as time goes by
this will become less of a concern, but if you want to be paranoid you
can copy the code from FFTW's libbench2/aligned-main.c to guarantee
alignment of your main function (with gcc).
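
The contents of aligned-main.c are not reproduced here. As an illustration of the general idea, the sketch below uses a different technique: gcc's force_align_arg_pointer function attribute, which on x86 targets generates a prologue that realigns the stack on entry to the function. It assumes a gcc-compatible compiler that supports this attribute; the names REALIGN_STACK, aligned_entry, and real_main are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Use the attribute only where it is meaningful (gcc-compatible
   compilers targeting x86); elsewhere the macro expands to nothing. */
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
#  define REALIGN_STACK __attribute__((force_align_arg_pointer))
#else
#  define REALIGN_STACK
#endif

/* Hypothetical stand-in for the program's real work.
   Returns 0 if a local double is 8-byte aligned, 1 otherwise. */
static int real_main(int argc, char **argv)
{
    volatile double x = 0.0;  /* volatile forces x to live in memory */
    (void)argc;
    (void)argv;
    return ((uintptr_t)&x % 8 == 0) ? 0 : 1;
}

/* Entry point whose prologue realigns the stack, so that this frame
   and the frames of everything it calls start from an aligned stack. */
REALIGN_STACK static int aligned_entry(int argc, char **argv)
{
    return real_main(argc, argv);
}
```

In a real program, the attribute would typically be placed on main itself, or main would do nothing except call a function like aligned_entry. Whether this is needed at all depends on your compiler and C library, as discussed above.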