Building

Notes on compiling and linking applications

Macro Definitions

In addition to the common core library, libhpk_core.so, the Hpk libraries are divided into ISA-specific and precision-specific shared object files. All applications should be linked with the core library, but developers can choose to link with any subset of the others. Typically, one would link with all relevant libraries so that the implementation can be selected at run time based on the hardware. For example, an x86_64 application using single-precision FFTs would link with both libhpk_fft_avx2_fp32.so and libhpk_fft_avx512_fp32.so.
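Hpk performs the hardware-based selection itself, so applications do not need code like the following. It is only a minimal sketch of the idea, assuming a GCC- or Clang-compatible compiler on x86_64. The function names pickFp32Library and pickFp32LibraryForThisCpu are hypothetical, and the choice of "avx512f" as the AVX512 feature to test is an assumption, not a statement about what Hpk checks.

```cpp
#include <string>

// Pure selection logic (hypothetical helper, not part of Hpk):
// prefer the widest ISA that the CPU reports.
std::string pickFp32Library(bool hasAvx512, bool hasAvx2) {
    if (hasAvx512) return "libhpk_fft_avx512_fp32.so";
    if (hasAvx2)   return "libhpk_fft_avx2_fp32.so";
    return "";  // no matching library for this CPU
}

// Queries the running CPU using compiler builtins. On other platforms or
// compilers it reports that neither ISA is available.
std::string pickFp32LibraryForThisCpu() {
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
    __builtin_cpu_init();
    // Assumption: "avx512f" stands in for whatever AVX512 subset is needed.
    const bool avx512 = __builtin_cpu_supports("avx512f") != 0;
    const bool avx2   = __builtin_cpu_supports("avx2") != 0;
    return pickFp32Library(avx512, avx2);
#else
    return pickFp32Library(false, false);
#endif
}
```

Keeping the decision in a pure function separate from the CPU query makes the preference order easy to check on any machine.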

Before including the header file hpk/fft/makeFactory.hpp, one must define preprocessor macros to indicate which libraries will be directly linked with the application so that makeFactory() can use the symbols (function names) found in those libraries.
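The ordering requirement can be sketched as follows for an application that links the two fp32 AVX libraries. The macros may equally be supplied with -D on the compiler command line instead of in the source. The __has_include guard is only there so that the sketch compiles on a machine without Hpk installed, and directlyLinkedFftLibraries is a hypothetical helper used to show which macros are in effect.

```cpp
// These definitions must come before the include so that makeFactory()
// can refer to the symbols in the corresponding libraries.
#define HPK_HAVE_FFT_AVX2_FP32
#define HPK_HAVE_FFT_AVX512_FP32

// Guard for illustration only: a real application includes the header
// unconditionally.
#if __has_include(<hpk/fft/makeFactory.hpp>)
#include <hpk/fft/makeFactory.hpp>
#endif

#include <string>
#include <vector>

// Hypothetical helper: lists the libraries that the macros above promise
// will be present on the link command line.
std::vector<std::string> directlyLinkedFftLibraries() {
    std::vector<std::string> libs;
#ifdef HPK_HAVE_FFT_AVX2_FP32
    libs.push_back("libhpk_fft_avx2_fp32.so");
#endif
#ifdef HPK_HAVE_FFT_AVX2_FP64
    libs.push_back("libhpk_fft_avx2_fp64.so");  // not defined above, so omitted
#endif
#ifdef HPK_HAVE_FFT_AVX512_FP32
    libs.push_back("libhpk_fft_avx512_fp32.so");
#endif
    return libs;
}
```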

Alternatively, an application can use a given shared library by loading it dynamically at run time. In this case, the corresponding HPK_HAVE macro is not defined at compile time and the shared library is not linked at build time. This approach may be useful for large “all-purpose” applications in which the FFT computations are used only for optional functionality. The example advanced/fft_3x6.cpp links the AVX2 library at build time while the AVX512 library is dynamically loaded at run time using dlopen().
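Run-time loading of this kind might look like the sketch below. It uses only the POSIX dlopen() and dlerror() calls. The LoadResult type and the loadLibrary function are hypothetical names, and the default flag RTLD_NOW is an assumption; the example advanced/fft_3x6.cpp remains the authoritative reference for how Hpk expects the library to be loaded.

```cpp
#include <dlfcn.h>   // dlopen, dlerror (POSIX)
#include <string>

// Outcome of an attempted run-time load (hypothetical type).
struct LoadResult {
    void* handle;       // nullptr on failure
    std::string error;  // dlerror() text on failure, empty on success
};

// Tries to load a shared library by name, e.g. "libhpk_fft_avx512_fp32.so".
// The default flags are an assumption of this sketch.
LoadResult loadLibrary(const char* soname, int flags = RTLD_NOW) {
    dlerror();  // clear any stale error message
    void* handle = dlopen(soname, flags);
    if (handle != nullptr) {
        return {handle, std::string()};
    }
    const char* msg = dlerror();
    return {nullptr, msg != nullptr ? msg : "unknown dlopen failure"};
}
```

An application would call loadLibrary("libhpk_fft_avx512_fp32.so") once at startup and, if the handle is null, continue with only the libraries it linked at build time.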

The behavior of makeFactory() is controlled by defining the following preprocessor macros:

Macro                       Effect of definition
--------------------------  ----------------------------------------
HPK_HAVE_FFT_AVX2_FP32      Use symbols in libhpk_fft_avx2_fp32.so
HPK_HAVE_FFT_AVX2_FP64      Use symbols in libhpk_fft_avx2_fp64.so
HPK_HAVE_FFT_AVX512_FP16    Use symbols in libhpk_fft_avx512_fp16.so
HPK_HAVE_FFT_AVX512_FP32    Use symbols in libhpk_fft_avx512_fp32.so
HPK_HAVE_FFT_AVX512_FP64    Use symbols in libhpk_fft_avx512_fp64.so
HPK_HAVE_FFT_SVE256_FP16    Use symbols in libhpk_fft_sve256_fp16.so
HPK_HAVE_FFT_SVE256_FP32    Use symbols in libhpk_fft_sve256_fp32.so
HPK_HAVE_FFT_SVE256_FP64    Use symbols in libhpk_fft_sve256_fp64.so
HPK_HAVE_FFT_OMP            Use symbols in libhpk_fft_*omp.so
HPK_FFT_NDLSYM              Do not use dlsym() to locate symbols

If one of the HPK_HAVE macros is defined and the corresponding shared object is not provided on the link command line, then ld will report an undefined reference and fail to link the application. Conversely, if one of the libhpk_fft shared objects is provided on the link command line but the corresponding HPK_HAVE macro is not defined at compile time, then ld will resolve nothing from that shared object, because the application references none of its symbols.

Defining HPK_FFT_NDLSYM prevents dynamic lookup of symbol names at run time. Of course, if the library symbols are accessed directly (by defining the appropriate HPK_HAVE macro) then dlsym() will not be called anyway, and so generally it is not recommended to define this macro. Nevertheless, the example advanced/fft_12.cpp shows how to do it.
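The interplay between direct symbol use, dynamic lookup, and disabling the lookup can be illustrated with a general sketch of the technique. This is not Hpk's implementation, which this page does not show. The macros DEMO_HAVE_DIRECT and DEMO_NDLSYM are hypothetical stand-ins for an HPK_HAVE macro and HPK_FFT_NDLSYM, and the C library function strlen stands in for a library symbol.

```cpp
#include <dlfcn.h>   // dlopen, dlsym, dlclose (POSIX)
#include <cstddef>
#include <cstring>

// Hypothetical switches for this sketch only (uncomment to experiment):
// #define DEMO_HAVE_DIRECT   // plays the role of an HPK_HAVE macro
// #define DEMO_NDLSYM        // plays the role of HPK_FFT_NDLSYM

using StrlenFn = std::size_t (*)(const char*);

// Returns a pointer to strlen, located at link time or at run time.
StrlenFn locateStrlen() {
#if defined(DEMO_HAVE_DIRECT)
    // Direct reference: the linker must resolve this symbol at build time,
    // so a missing library is reported as an undefined reference.
    return &std::strlen;
#elif defined(DEMO_NDLSYM)
    // Dynamic lookup disabled: no dlsym() call is compiled in.
    return nullptr;
#else
    // Fallback: search the symbols already loaded into the process.
    void* self = dlopen(nullptr, RTLD_LAZY);
    if (self == nullptr) return nullptr;
    void* sym = dlsym(self, "strlen");
    dlclose(self);  // the C library stays loaded, so the pointer stays valid
    return reinterpret_cast<StrlenFn>(sym);
#endif
}
```

With neither switch defined, the function is found by dlsym(); with DEMO_HAVE_DIRECT defined, the dlsym() branch is never compiled, which mirrors why HPK_FFT_NDLSYM is seldom needed when the HPK_HAVE macros are defined.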

To enable OpenMP parallelism, define HPK_HAVE_FFT_OMP and link with both an Hpk *omp.so library and the corresponding (externally available) OpenMP runtime library. For example, on x86_64, link with libhpk_fft_iomp.so and Intel’s OpenMP library libiomp5.so; on aarch64, link with libhpk_fft_omp.so and the LLVM runtime library libomp.so.

Compiler command line

Linking an application with both AVX2 and AVX512 libraries allows hardware detection to select the architecture at run time. For example, a single precision application for x86_64 would typically want to link with both AVX versions of the fp32 libraries and so would define the relevant macros for compiling as follows:

-DHPK_HAVE_FFT_AVX2_FP32 -DHPK_HAVE_FFT_AVX512_FP32

and would use the following options for linking:

-lhpk_fft_avx2_fp32 -lhpk_fft_avx512_fp32 -lhpk_core -ldl

Linking with -ldl is required because the dlsym() call is present in the binary whether or not it is actually called at run time. It would be called, for example, if this application made a double precision factory. (As of version 2.34, the GNU C Library, libc, has integrated libdl, so applications no longer need to link with -ldl. Doing so still works, but libdl is empty.)
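Whether -ldl is still needed on a particular machine can be checked against the 2.34 threshold mentioned above. The following is a minimal sketch using the glibc function gnu_get_libc_version(). The helper names needsExplicitLibdl and thisSystemNeedsLibdl are hypothetical, and the check reports the C library of the machine on which it runs, which is assumed here to be the build machine.

```cpp
#include <cstdio>
#include <string>
#if defined(__GLIBC__)
#include <gnu/libc-version.h>  // gnu_get_libc_version()
#endif

// True when the given glibc version string (e.g. "2.31") predates 2.34,
// the release that merged libdl into libc.
bool needsExplicitLibdl(const std::string& version) {
    int major = 0;
    int minor = 0;
    if (std::sscanf(version.c_str(), "%d.%d", &major, &minor) != 2) {
        return true;  // unrecognized format: keep -ldl, which is harmless
    }
    return major < 2 || (major == 2 && minor < 34);
}

// Applies the check to the C library in use on this machine.
bool thisSystemNeedsLibdl() {
#if defined(__GLIBC__)
    return needsExplicitLibdl(gnu_get_libc_version());
#else
    return true;  // not glibc: this sketch makes no claim, so keep -ldl
#endif
}
```

Since passing -ldl is harmless on newer systems, the simplest portable choice is to keep it on the link line regardless of the result.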

To disable dynamic symbol lookup in the previous example, compile instead with

-DHPK_HAVE_FFT_AVX2_FP32 -DHPK_HAVE_FFT_AVX512_FP32 -DHPK_FFT_NDLSYM

and link using

-lhpk_fft_avx2_fp32 -lhpk_fft_avx512_fp32 -lhpk_core

CMake targets

For developers using CMake, the following targets are provided on x86_64:

CMake target            Links with library
----------------------  --------------------------------
hpk::core               libhpk_core.so, libdl.so
hpk::fft_avx2_fp32      libhpk_fft_avx2_fp32.so
hpk::fft_avx2_fp64      libhpk_fft_avx2_fp64.so
hpk::fft_avx512_fp16    libhpk_fft_avx512_fp16.so
hpk::fft_avx512_fp32    libhpk_fft_avx512_fp32.so
hpk::fft_avx512_fp64    libhpk_fft_avx512_fp64.so
hpk::fft_iomp           libhpk_fft_iomp.so, libiomp5.so

And, on aarch64:

CMake target            Links with library
----------------------  --------------------------------
hpk::core               libhpk_core.so, libdl.so
hpk::fft_sve256_fp16    libhpk_fft_sve256_fp16.so
hpk::fft_sve256_fp32    libhpk_fft_sve256_fp32.so
hpk::fft_sve256_fp64    libhpk_fft_sve256_fp64.so
hpk::fft_omp            libhpk_fft_omp.so, libomp.so

The targets above will add the listed shared object(s) to the link libraries of your application and also add the corresponding HPK_HAVE macro to its compile definitions. Note also that specifying one or more of the hpk::fft targets will transitively add the hpk::core target since each depends upon it.

Therefore, using CMake, the first example from above is reproduced by the following line in CMakeLists.txt:

target_link_libraries(myexample PRIVATE hpk::fft_avx2_fp32
                                        hpk::fft_avx512_fp32)

To disable dynamic symbol lookup, a target property is provided, and it is set on an application target (e.g., myexample) as follows:

set_property(TARGET myexample PROPERTY HPK_FFT_NDLSYM True)

Having set this property, the target hpk::core in the application’s link libraries (set either explicitly or transitively) adds -DHPK_FFT_NDLSYM to the flags for compiling and adds -lhpk_core, but not -ldl, to the flags for linking. Thus, we have reproduced the second example from above.

Please see C++ Examples for more information, noting the CMakeLists.txt configuration file provided there.