Node:SIMD alignment and fftw_malloc, Next:Stack alignment on x86, Previous:Data Alignment, Up:Data Alignment
SIMD, which stands for "Single Instruction Multiple Data," is a set of special operations supported by some processors to perform a single operation on several numbers (usually 2 or 4) simultaneously. SIMD floating-point instructions are available on several popular CPUs: SSE/SSE2 (single/double precision) on Pentium III/IV and higher, 3DNow! (single precision) on the AMD K7 and higher, and AltiVec (single precision) on the PowerPC G4 and higher. FFTW can be compiled to support the SIMD instructions on any of these systems.
A program linking to an FFTW library compiled with SIMD support can
obtain a nonnegligible speedup for most complex and r2c/c2r
transforms. In order to obtain this speedup, however, the arrays of
complex (or real) data passed to FFTW must be specially aligned in
memory (typically 16-byte aligned), and often this alignment is more
stringent than that provided by the usual malloc (etc.) allocation
routines.
In order to guarantee proper alignment for SIMD, therefore, in case
your program is ever linked against a SIMD-using FFTW, we recommend
allocating your transform data with fftw_malloc and de-allocating it
with fftw_free. These have exactly the same interface and behavior as
malloc/free, except that for a SIMD FFTW they ensure that the returned
pointer has the necessary alignment (by calling memalign or its
equivalent on your OS).
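To make the idea of an alignment-guaranteeing allocator concrete, here
is a minimal sketch in C. It is not FFTW's implementation: the names
aligned16_malloc and aligned16_free are hypothetical, the 16-byte
figure is the typical value mentioned above, and the C11 function
aligned_alloc stands in for "memalign or its equivalent."

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper (not part of FFTW): allocate at least n bytes
   whose address is a multiple of 16.  C11 aligned_alloc wants the
   size to be a multiple of the alignment, so round n up first. */
void *aligned16_malloc(size_t n)
{
    size_t rounded;

    if (n == 0)
        n = 1;                      /* always hand back a usable block */
    if (n > SIZE_MAX - 15)
        return NULL;                /* rounding up would overflow */
    rounded = (n + 15) & ~(size_t)15;
    return aligned_alloc(16, rounded);
}

/* Memory from aligned_alloc is released with the ordinary free. */
void aligned16_free(void *p)
{
    free(p);
}
```

In a program that links against FFTW you would simply call
fftw_malloc and fftw_free in place of these helpers, and release each
block with the routine that matches the one that allocated it.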
You are not required to use fftw_malloc. You can allocate your data in
any way that you like, from malloc to new (in C++) to a static array
declaration. If the array happens not to be properly aligned, FFTW
will not use the SIMD extensions.
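If you allocate data yourself, it can be useful to see whether a given
array meets the alignment a SIMD build would want. The following
sketch assumes the typical 16-byte requirement and a platform where
converting a pointer to uintptr_t yields its address (true on
mainstream systems, but implementation-defined in C). The names
is_aligned, aligned_data, and get_aligned_data are hypothetical and
not part of FFTW.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: nonzero if the address of p is a multiple of
   `alignment` bytes.  Relies on the pointer-to-integer conversion
   giving the plain address. */
int is_aligned(const void *p, size_t alignment)
{
    return alignment != 0 && ((uintptr_t)p % alignment) == 0;
}

/* A static array is not necessarily 16-byte aligned by default; C11
   _Alignas requests that alignment explicitly.  Room for 64
   interleaved complex values (real, imaginary) is an arbitrary
   choice for illustration. */
static _Alignas(16) double aligned_data[2 * 64];

double *get_aligned_data(void)
{
    return aligned_data;
}
```

For example, is_aligned(get_aligned_data(), 16) is nonzero, whereas a
pointer 8 bytes past the start of the array is 8-byte aligned but
fails the 16-byte test.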