3.1.2 Stack alignment on x86

On the Pentium and subsequent x86 processors, there is a substantial performance penalty if double-precision variables are not stored 8-byte aligned; a slowdown by a factor of two or more is not unusual. Unfortunately, the stack (where local variables and subroutine arguments live) is not guaranteed by the Intel ABI to be 8-byte aligned.
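To make "8-byte aligned" concrete, here is a minimal sketch in C of how one might test the address of a double-precision local variable. The helper names is_aligned and local_double_is_aligned are hypothetical and are not part of FFTW; the sketch assumes a C99 compiler that provides uintptr_t.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not an FFTW function): returns 1 if the address p
   is a multiple of `alignment` bytes, else 0.  `alignment` is assumed to
   be a power of two. */
static int is_aligned(const void *p, size_t alignment)
{
    return ((uintptr_t)p & (alignment - 1)) == 0;
}

/* Hypothetical helper: reports whether a double-precision local variable
   in this function's stack frame is stored at an 8-byte-aligned address.
   `volatile` keeps the compiler from placing x in a register only. */
static int local_double_is_aligned(void)
{
    volatile double x = 0.0;
    return is_aligned((const void *)&x, 8);
}
```

A result of 0 from local_double_is_aligned() would indicate the misaligned situation described above; a result of 1 only says that this particular frame happened to be aligned.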

Recent versions of gcc (as well as most other compilers, we are told, such as Intel's, Metrowerks', and Microsoft's) are able to keep the stack 8-byte aligned; gcc does this by default (see -mpreferred-stack-boundary in the gcc documentation). If you are not certain whether your compiler maintains stack alignment by default, it is a good idea to check.
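One rough way to check a compiler's behavior is to examine the address of a double-precision local variable at several call depths. The following is only a sketch: the function count_misaligned is hypothetical (it is not taken from FFTW), and a count of zero is evidence of alignment for this call chain, not a guarantee for every function in a program.

```c
#include <stdint.h>

/* Hypothetical check (not an FFTW function): makes `depth` nested calls
   and returns how many of them found their double-precision local at an
   address that is not a multiple of 8. */
static int count_misaligned(int depth, char pad)
{
    volatile char filler[3];   /* odd-sized local to perturb the frame layout */
    volatile double x = 0.0;   /* the variable whose address is examined */
    int bad = (((uintptr_t)&x) & 7) != 0;

    filler[0] = pad;
    if (depth <= 1)
        return bad;
    /* Adding `bad` after the call keeps this from being a tail call, so
       each level really gets its own stack frame. */
    return bad + count_misaligned(depth - 1, filler[0]);
}
```

Calling count_misaligned(8, 0) from main and finding a nonzero result would suggest that the compiler (or the starting stack) is not keeping double-precision locals 8-byte aligned.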

Unfortunately, gcc only preserves the stack alignment that it is given: if the stack starts off misaligned, it will always be misaligned, with a disastrous effect on performance (in double precision). Fortunately, recent versions of glibc (on GNU/Linux) provide a properly aligned starting stack, but this was not the case with a number of older versions, and we are not certain of the situation on other operating systems. Hopefully, as time goes by this will become less of a concern, but if you want to be paranoid you can copy the code from FFTW's libbench2/aligned-main.c to guarantee alignment of your main function (with gcc).