Building
Notes on compiling and linking applications
Macro Definitions
In addition to the common core library, libhpk_core.so, the Hpk libraries
are divided into ISA-specific and precision-specific shared object files.
All applications should be linked with the core library, but developers can
choose to link with any subset of the others.
Typically, one would link with all relevant libraries to allow runtime
selection based on hardware.
For example, an x86_64 application using single precision FFTs would link
with both libhpk_fft_avx2_fp32.so and libhpk_fft_avx512_fp32.so.
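The run-time selection mentioned above depends on detecting what the processor supports. How Hpk performs this detection is not described here, so the following is only a minimal sketch of the general idea. It assumes GCC or Clang on x86_64 and uses the compiler builtin __builtin_cpu_supports(). The function name preferredIsa() is hypothetical and is not part of Hpk, and testing only the "avx512f" feature is an assumption about what an AVX512 library might need.

```cpp
#include <string>

// Hypothetical helper (not part of Hpk): report the widest ISA, among those
// this page mentions for x86_64, that the running processor supports.
std::string preferredIsa() {
#if defined(__GNUC__) && defined(__x86_64__)
    __builtin_cpu_init();  // Populate the feature data read by the builtins below.
    if (__builtin_cpu_supports("avx512f")) {
        return "avx512";
    }
    if (__builtin_cpu_supports("avx2")) {
        return "avx2";
    }
    return "none";
#else
    // Other compilers or architectures: this sketch does not attempt detection.
    return "none";
#endif
}
```

An application linked with both the AVX2 and AVX512 libraries leaves this decision to the library. The sketch only shows why having both linked is useful.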
Before including the header file hpk/fft/makeFactory.hpp, one must define
preprocessor macros to indicate which libraries will be directly linked with
the application so that makeFactory() can use the symbols (function names)
found in those libraries.
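The ordering requirement can be illustrated without the Hpk headers. The sketch below is a stand-in, not the actual contents of hpk/fft/makeFactory.hpp: the DEMO_HAVE macros and the function directlyLinkedBackends() are hypothetical names chosen for illustration. It shows how code that follows a macro definition can include or exclude references depending on whether the macro is defined.

```cpp
#include <string>
#include <vector>

// Defined by the application BEFORE the header-like section below.
// DEMO_HAVE_FFT_AVX512_FP32 is intentionally left undefined.
#define DEMO_HAVE_FFT_AVX2_FP32

// ---- stand-in for a header that inspects the macros (hypothetical) ----
inline std::vector<std::string> directlyLinkedBackends() {
    std::vector<std::string> names;
#ifdef DEMO_HAVE_FFT_AVX2_FP32
    names.push_back("avx2_fp32");    // Compiled in: macro is defined.
#endif
#ifdef DEMO_HAVE_FFT_AVX512_FP32
    names.push_back("avx512_fp32");  // Compiled out: macro is not defined.
#endif
    return names;
}
```

In an actual application the same ordering applies: the HPK_HAVE definitions, or the equivalent -D compiler options, must be in effect before hpk/fft/makeFactory.hpp is included.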
Alternatively, an application can use a given shared library by loading it
dynamically at run time. In this case, the corresponding HPK_HAVE macro is
not defined at compile time and the shared library is not linked at build time.
This approach may be useful for large “all-purpose” applications in which the
FFT computations are used only for optional functionality.
The example advanced/fft_3x6.cpp links the AVX2 library at build time while
the AVX512 library is dynamically loaded at run time using dlopen().
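For readers unfamiliar with dlopen(), the following is a minimal sketch of loading a shared library at run time with error reporting. It is not taken from advanced/fft_3x6.cpp. The helper name loadSharedLibrary() is hypothetical, and the default flags RTLD_NOW | RTLD_GLOBAL are an assumption, since the flags that Hpk expects are not stated here.

```cpp
#include <dlfcn.h>
#include <string>

// Hypothetical helper (not part of Hpk): try to load a shared library at run
// time. Returns the handle from dlopen(), or nullptr with the reason stored
// in `error`.
void* loadSharedLibrary(const std::string& soname, std::string& error,
                        int flags = RTLD_NOW | RTLD_GLOBAL) {
    dlerror();  // Clear any stale error message.
    void* handle = dlopen(soname.c_str(), flags);
    if (handle == nullptr) {
        const char* message = dlerror();
        error = (message != nullptr) ? message : "unknown dlopen() failure";
    } else {
        error.clear();
    }
    return handle;
}
```

A call such as loadSharedLibrary("libhpk_fft_avx512_fp32.so", error) would correspond to the scenario described above. An application using FFTs only for optional functionality could disable that functionality when the call returns nullptr.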
The behavior of makeFactory() is controlled by defining the following
preprocessor macros:
Macro | Effect of definition |
---|---|
| Use symbols in |
| Use symbols in |
| Use symbols in |
| Use symbols in |
| Use symbols in |
| Use symbols in |
| Use symbols in |
| Use symbols in |
| Use symbols in |
| Do not use |
If one of the HPK_HAVE macros is defined and the corresponding shared object
is not provided on the link command line, then ld will report an undefined
reference and fail to link the application. Contrariwise, if one of the
libhpk_fft shared objects is provided on the link command line and the
corresponding HPK_HAVE macro is not defined at compile time, then ld
will do nothing with the shared object, for the application references none
of its symbols.
Defining HPK_FFT_NDLSYM prevents dynamic lookup of symbol names at run time.
Of course, if the library symbols are accessed directly (by defining the
appropriate HPK_HAVE macro) then dlsym() will not be called anyway, and so
generally it is not recommended to define this macro.
Nevertheless, the example advanced/fft_12.cpp shows how to do it.
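The effect of such a macro can be pictured with a small stand-in. The sketch below is not Hpk code: DEMO_NDLSYM plays the role of a macro like HPK_FFT_NDLSYM, and findSymbol() is a hypothetical helper. When the macro is defined, the dlsym() call is removed by the preprocessor, so the compiled object contains no reference to it.

```cpp
#include <dlfcn.h>

// Hypothetical helper (not part of Hpk): look up a symbol by name among the
// libraries already loaded into the process, unless lookup is compiled out.
void* findSymbol(const char* name) {
#ifdef DEMO_NDLSYM
    (void)name;      // Lookup disabled at compile time: no dlsym() reference.
    return nullptr;
#else
    dlerror();       // Clear any stale error message.
    return dlsym(RTLD_DEFAULT, name);  // nullptr if the symbol is not found.
#endif
}
```

Compiling with -DDEMO_NDLSYM yields a function that always returns nullptr and needs nothing from the dynamic linking library. Without the macro, the function returns nullptr only for names that no loaded library provides.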
To enable OpenMP parallelism, define HPK_HAVE_FFT_OMP and link with the
appropriate Hpk *omp.so library.
Note that these libraries have a dependency on a corresponding (externally
available) OpenMP library.
On x86_64 systems, the library libhpk_fft_iomp.so needs libiomp5.so
(Intel’s OpenMP library), and on aarch64, the library libhpk_fft_omp.so
needs libomp.so (LLVM’s OpenMP library).
Compiler command line
Linking an application with both AVX2 and AVX512 libraries allows hardware
detection to select the architecture at run time.
For example, a single precision application for x86_64 would typically want
to link with both AVX versions of the fp32 libraries and so would define the
relevant macros for compiling as follows:
-DHPK_HAVE_FFT_AVX2_FP32 -DHPK_HAVE_FFT_AVX512_FP32
and would use the following options for linking:
-lhpk_fft_avx2_fp32 -lhpk_fft_avx512_fp32 -lhpk_core -ldl
On older systems, linking with -ldl is required, since the dlsym() call is
present in the application regardless of whether or not it is actually called
at run time.
The function would be called, for example, to make a double precision factory
if the application were to request it.
However, as of version 2.34 of the GNU C Library, libc.so.6 has integrated
the functions for dynamic linking, so libdl.a is empty, and it is no longer
necessary to use the link flag -ldl.
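The version threshold above can be expressed as a small check. This is a sketch only: needsSeparateLibdl() is a hypothetical helper, not part of Hpk, and it treats an unparseable version string conservatively by reporting that -ldl is still needed.

```cpp
#include <cstdio>

// Hypothetical helper (not part of Hpk): decide from a GNU C Library version
// string such as "2.31" whether the dynamic linking functions still live in
// a separate libdl (true) or have been integrated into libc (false).
bool needsSeparateLibdl(const char* glibcVersion) {
    int major = 0;
    int minor = 0;
    if (std::sscanf(glibcVersion, "%d.%d", &major, &minor) != 2) {
        return true;  // Unparseable input: conservatively keep -ldl.
    }
    return major < 2 || (major == 2 && minor < 34);
}
```

On systems using the GNU C Library, the version string can be obtained at run time from gnu_get_libc_version(), declared in <gnu/libc-version.h>.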
To disable dynamic symbol lookup in the previous example, compile instead with
-DHPK_HAVE_FFT_AVX2_FP32 -DHPK_HAVE_FFT_AVX512_FP32 -DHPK_FFT_NDLSYM
and link using
-lhpk_fft_avx2_fp32 -lhpk_fft_avx512_fp32 -lhpk_core
CMake targets
For developers using CMake, the following targets are provided on x86_64:
CMake target | Links with library |
---|---|
| |
| |
| |
| |
| |
| |
| |
And, on aarch64:
CMake target | Links with library |
---|---|
| |
| |
| |
| |
| |
The targets above will add the listed shared object to the link libraries
of your application and also will add the corresponding HPK_HAVE macro to
its compile definitions.
Note that specifying one or more of the hpk::fft targets will transitively
add the hpk::core target since each depends upon it.
Therefore, using CMake, the first example under Compiler command line can be
reproduced with the following line in CMakeLists.txt:
target_link_libraries(myexample PRIVATE hpk::fft_avx2_fp32
                                        hpk::fft_avx512_fp32
                                        ${CMAKE_DL_LIBS})
Note
GNU C Library versions 2.34 and later have the dynamic linking functions
in libc.so.6.
On these platforms, linking with ${CMAKE_DL_LIBS} is unnecessary, but
doing so is harmless since libdl.a is empty.
In this and earlier versions of Hpk, the provided CMake configuration file
automatically adds -ldl as a link flag for target hpk::core (which
itself may be explicit or implicit).
Starting with version 0.6.0, Hpk will no longer add -ldl.
Therefore, developers on older platforms will have to add ${CMAKE_DL_LIBS}
explicitly to target_link_libraries() as shown above.
To disable dynamic symbol lookup, define HPK_FFT_NDLSYM for the application
target (e.g., myexample) as follows:
target_compile_definitions(myexample PRIVATE HPK_FFT_NDLSYM)
This adds -DHPK_FFT_NDLSYM to the flags used for compiling myexample.
Please see C++ Examples for more information, noting the example
CMakeLists.txt configuration file provided there.
Note
In this and earlier versions of Hpk, -ldl is automatically added as a
link flag for target hpk::core.
To disable this and also add -DHPK_FFT_NDLSYM as a compile flag, the
target property HPK_FFT_NDLSYM can be defined as follows:
set_property(TARGET myexample PROPERTY HPK_FFT_NDLSYM True)
Having set this property, target hpk::core in the application’s link
libraries adds the compile flag -DHPK_FFT_NDLSYM and adds the link flag
-lhpk_core, but omits -ldl.
On systems using GNU C Library versions 2.34 and later, -ldl is harmless,
since it links with libdl.a, which is empty.
In version 0.6.0 of the Hpk library, setting property HPK_FFT_NDLSYM to True
will be equivalent to defining HPK_FFT_NDLSYM.
For readability, target_compile_definitions() is preferred over
set_property().
Meson dependencies
For developers using Meson, the following pkg-config modules are provided
on x86_64:
Module | Links with library |
---|---|
| |
| |
| |
| |
| |
| |
| |
And, on aarch64:
Module | Links with library |
---|---|
| |
| |
| |
| |
| |
Specifying a module from above as one of the dependencies will add the listed
shared object to the link libraries of your application and also will add the
corresponding HPK_HAVE macro to its compile flags.
Note that specifying one or more of the hpk_fft dependencies will transitively
add hpk_core as a dependency since each of the FFT modules depends upon it.
Therefore, using Meson, the first example under Compiler command line, which
uses single precision, can be reproduced with the following lines in
meson.build:
hpk_fft_avx2 = dependency('hpk_fft_avx2_fp32')
hpk_fft_avx512 = dependency('hpk_fft_avx512_fp32')
dl_dep = dependency('dl')
executable('myexample', 'example.cpp',
           dependencies: [hpk_fft_avx2, hpk_fft_avx512, dl_dep])
Note
GNU C Library versions 2.34 and later have the dynamic linking functions
in libc.so.6, and the linker flag -ldl links with libdl.a, which is
empty.
Therefore, on these systems, the dependency dl_dep is unnecessary (but
is harmless).
To disable dynamic symbol lookup, add -DHPK_FFT_NDLSYM to the flags used for
compiling myexample as follows:
hpk_fft_avx2 = dependency('hpk_fft_avx2_fp32')
hpk_fft_avx512 = dependency('hpk_fft_avx512_fp32')
executable('myexample', 'example.cpp',
           cpp_args: '-DHPK_FFT_NDLSYM',
           dependencies: [hpk_fft_avx2, hpk_fft_avx512])
Please see C++ Examples for more information, noting the example
meson.build configuration file provided there.