High Performance Kernels for the Fast Fourier Transform
Hpk is a library for computing FFTs of real or complex data, in one or more dimensions, in half, single, or double precision.
Overview
Modern C++ – Hpk is designed for C++ developers. Functions and types are architected to allow most common errors to be detected at compilation time, and FFT compute objects are immutable, thread-safe, fully initialized at construction, and managed by smart pointers.
Accuracy – The accuracy of Hpk is typically superior to that of vendor-supplied FFT libraries (e.g., 1.3X on Sapphire Rapids). For details, see our papers below.
Performance – The performance of Hpk is generally higher than that of vendor-tuned libraries (e.g., 1.3X on Sapphire Rapids and over 2X on Graviton3E). For details, see our papers below.
Python – The Python interface supports NumPy, JAX, PyTorch, and TensorFlow. Both accuracy and performance are superior to those of SciPy (e.g., 1.1X float32 accuracy, 1.2X float64 accuracy, and over 2X performance). For details, see our most recent paper below.
Documentation
This document corresponds to release 0.4.0 of the Hpk library.
Hpk uses semantic versioning, i.e., Major.Minor.Patch, where Major is incremented for incompatible API changes, Minor for backward-compatible additions to functionality, and Patch for backward-compatible bug fixes or performance enhancements.
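To illustrate how this versioning scheme can be used in practice, the following Python sketch parses a Major.Minor.Patch string and checks whether an installed release can stand in for a required one. The names `Version`, `parse_version`, and `is_compatible` are hypothetical helpers for this example and are not part of Hpk. The Semantic Versioning specification also treats Major version 0 as initial development, in which any release may change the API; the sketch ignores that refinement and applies only the rules stated above.

```python
from typing import NamedTuple


class Version(NamedTuple):
    major: int
    minor: int
    patch: int


def parse_version(text: str) -> Version:
    """Parse a 'Major.Minor.Patch' string such as '0.4.0'."""
    parts = text.strip().split(".")
    if len(parts) != 3:
        raise ValueError(f"expected Major.Minor.Patch, got {text!r}")
    return Version(*(int(p) for p in parts))


def is_compatible(installed: str, required: str) -> bool:
    """Return True if `installed` can stand in for `required`.

    The Major numbers must match (no incompatible API changes), and the
    installed (Minor, Patch) pair must be equal or newer, so that all
    required functionality and fixes are present.
    """
    have, need = parse_version(installed), parse_version(required)
    return have.major == need.major and have[1:] >= need[1:]
```

For instance, `is_compatible("0.4.1", "0.4.0")` is true because only Patch increased, while `is_compatible("1.0.0", "0.4.0")` is false because Major changed.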
The latest version of the documentation is available online at:
https://hpkfft.com
The documentation for your local installation may be found in the subdirectory share/doc/hpk/html.
For example, if the core-devel package has been installed in /opt/libhpk0, the associated documentation can be accessed in a web browser using the URL:
file:///opt/libhpk0/share/doc/hpk/html/index.html
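For an installation under a different prefix, the URL can be derived the same way. The short Python sketch below does this with the standard library; `doc_url` is a hypothetical helper for this example, and only the relative documentation path is taken from the text above.

```python
from pathlib import Path

# Relative location of the documentation inside an installation prefix,
# as described in the text above.
DOC_INDEX = Path("share/doc/hpk/html/index.html")


def doc_url(prefix: str) -> str:
    """Return the file:// URL of the documentation under `prefix`.

    `prefix` must be an absolute path; Path.as_uri() raises ValueError
    for relative paths.
    """
    return (Path(prefix) / DOC_INDEX).as_uri()


print(doc_url("/opt/libhpk0"))
```

Passing the resulting string to `webbrowser.open()` from the Python standard library is one way to open it in the default browser.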
Papers
Downloading
This software is currently available for Linux/x86_64 on hardware supporting AVX2 (and, optionally, AVX512) and for Linux/aarch64 on hardware supporting SVE with 256-bit vectors.
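One way to check these hardware requirements on a Linux machine is to inspect /proc/cpuinfo, which lists CPU features on a `flags` line on x86_64 and a `Features` line on aarch64. The Python sketch below is an illustration only: `cpu_features` and `supported_by_cpu` are hypothetical helpers, not part of Hpk, and the sketch does not verify the 256-bit SVE vector length, which /proc/cpuinfo does not report.

```python
from pathlib import Path


def cpu_features(cpuinfo_text: str) -> set[str]:
    """Collect feature names from the 'flags' (x86_64) or 'Features'
    (aarch64) lines of /proc/cpuinfo-style text."""
    features: set[str] = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() in ("flags", "Features"):
            features.update(value.split())
    return features


def supported_by_cpu(machine: str, features: set[str]) -> bool:
    """Apply the requirements stated above: AVX2 on x86_64, SVE on aarch64."""
    if machine == "x86_64":
        return "avx2" in features
    if machine == "aarch64":
        return "sve" in features  # the vector length is not checked here
    return False


if __name__ == "__main__":
    import platform

    cpuinfo = Path("/proc/cpuinfo")
    text = cpuinfo.read_text() if cpuinfo.exists() else ""
    feats = cpu_features(text)
    machine = platform.machine()
    print(machine, "meets CPU requirement:", supported_by_cpu(machine, feats))
    print("AVX-512 foundation (avx512f) present:", "avx512f" in feats)
```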
Note the minimum GNU C library versions listed below.
For example, Debian 11 (and later) is a good choice for x86_64, as are Debian 12 and Amazon Linux 2023 for aarch64.
| Architecture | Minimum libc.so.6 | Link |
|---|---|---|
| Intel/AMD 64 | 2.31 | |
| ARM 64 | 2.34 | |
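The minimum GNU C library versions in the table above can be checked against a running system with Python's standard `platform` module. The sketch below is an illustration under stated assumptions: `MIN_GLIBC`, `parse_glibc`, and `glibc_is_sufficient` are hypothetical names, and the dictionary keys assume that `platform.machine()` reports `x86_64` and `aarch64` for the two architectures.

```python
import platform

# Minimum glibc versions from the table above, keyed by the value that
# platform.machine() is assumed to report on each architecture.
MIN_GLIBC = {"x86_64": (2, 31), "aarch64": (2, 34)}


def parse_glibc(version: str) -> tuple[int, ...]:
    """Turn a version string such as '2.31' into a comparable tuple."""
    return tuple(int(p) for p in version.split("."))


def glibc_is_sufficient(machine: str, version: str) -> bool:
    """Return True if glibc `version` meets the minimum listed for `machine`."""
    minimum = MIN_GLIBC.get(machine)
    if minimum is None or not version:
        return False
    return parse_glibc(version) >= minimum


if __name__ == "__main__":
    lib, version = platform.libc_ver()
    machine = platform.machine()
    if lib == "glibc":
        verdict = "ok" if glibc_is_sufficient(machine, version) else "too old or unsupported"
        print(f"glibc {version} on {machine}: {verdict}")
    else:
        print("glibc not detected")
```

Versions are compared as integer tuples rather than strings so that, for example, 2.4 sorts before 2.31.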
Please see Getting Started for installation instructions and a description of the package files.
The Python Download Page has Linux/x86_64 wheels for Python 3.11 and later.
Contact Info
Our email address is:
support@hpkfft.com