3.1.1 SIMD alignment and fftw_malloc

SIMD, which stands for "Single Instruction, Multiple Data," is a set of special instructions supported by some processors that apply a single operation to several numbers (usually 2 or 4) simultaneously. SIMD floating-point instructions are available on several popular CPUs: SSE/SSE2 (single/double precision) on Pentium III/IV and higher, 3DNow! (single precision) on the AMD K7 and higher, and AltiVec (single precision) on the PowerPC G4 and higher. FFTW can be compiled to support the SIMD instructions on any of these systems.

A program linked against an FFTW library compiled with SIMD support can obtain a nonnegligible speedup for most complex and r2c/c2r transforms. To obtain this speedup, however, the arrays of complex (or real) data passed to FFTW must be specially aligned in memory (typically 16-byte aligned), and this alignment is often more stringent than that provided by the usual allocation routines such as malloc.
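To make the alignment requirement concrete, the following minimal sketch tests whether a pointer lies on a given byte boundary. The helper name is_aligned is hypothetical and not part of FFTW, and the 16-byte figure in the usage note is only the typical value mentioned above. The pointer-to-integer conversion is implementation-defined in C, although it behaves as expected on common platforms.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not an FFTW function): returns 1 if p lies on an
   `alignment`-byte boundary and 0 otherwise.  `alignment` must be a
   power of two, such as 16. */
static int is_aligned(const void *p, size_t alignment)
{
    /* Reject zero and any value that is not a power of two. */
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);

    /* For a power-of-two alignment, the low bits of an aligned
       address are all zero. */
    return ((uintptr_t)p & (uintptr_t)(alignment - 1)) == 0;
}
```

Calling is_aligned(ptr, 16) on a pointer returned by malloc shows whether that particular block happens to satisfy the typical SIMD requirement. A successful check on one allocation does not guarantee the same result for the next one.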

To guarantee proper alignment for SIMD in case your program is ever linked against a SIMD-enabled FFTW, we therefore recommend allocating your transform data with fftw_malloc and deallocating it with fftw_free. These have exactly the same interface and behavior as malloc/free, except that for a SIMD FFTW they ensure that the returned pointer has the necessary alignment (by calling memalign or its equivalent on your OS).

You are not required to use fftw_malloc. You can allocate your data in any way that you like, from malloc to new (in C++) to a static array declaration. If the array happens not to be properly aligned, FFTW will not use the SIMD extensions.