System: 1 GHz ARM Cortex-A8 (BeagleBoard-xM)
Compiler: gcc 4.4.5 (native)
Configure settings: --enable-single --enable-neon --enable-perf-events ARM_CPU_TYPE=cortex-a8
Comments: This is why NEON support matters: it delivers 5 to 8 times the performance across a wide range of FFT sizes. Peak performance is typically 600 MFLOPS to 1 GFLOPS. These results were obtained without using the fused multiply-add (FMA) instructions of the NEON FPU; an otherwise identical set of benchmarks with FMA enabled shows 5-10% lower performance.
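For readers who want to relate the MFLOPS figures above to raw timings, here is a minimal sketch. It assumes these benchmarks follow the usual FFTW benchmark convention, in which a transform of size N is credited with a nominal 5 N log2(N) floating-point operations (2.5 N log2(N) for real input) regardless of how many operations the code actually performs. The function name and the sample timing are hypothetical and are not taken from the measured data.

```python
import math


def fft_mflops(n, microseconds, real_input=False):
    """Nominal MFLOPS for one size-n FFT that took `microseconds` to run.

    Assumes the conventional FFT benchmark scaling: 5*n*log2(n) nominal
    flops for a complex transform, half that for a real-input transform.
    """
    if n < 2 or microseconds <= 0:
        raise ValueError("n must be >= 2 and microseconds must be positive")
    factor = 2.5 if real_input else 5.0
    nominal_flops = factor * n * math.log2(n)
    # flops per microsecond is numerically equal to MFLOPS
    return nominal_flops / microseconds


# Hypothetical timing for illustration only (not a measured result):
# a 1024-point complex transform completing in 64 microseconds.
print(fft_mflops(1024, 64.0))  # 800.0
```

Under this convention, a figure in the 600 MFLOPS to 1 GFLOPS range for a 1024-point complex transform would correspond to roughly 51 to 85 microseconds per transform.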
Detailed NEON timing data for these cases
Copyright © 2010-11 Vesperix Corporation