
How to do it
According to the documentation of the Eigen library, it is sufficient to set the appropriate compiler flag to enable the generation of vectorized code. Let us look at CMakeLists.txt:
- We declare a C++11 project:
cmake_minimum_required(VERSION 3.5 FATAL_ERROR)
project(recipe-06 LANGUAGES CXX)
set(CMAKE_CXX_STANDARD 11)
set(CMAKE_CXX_EXTENSIONS OFF)
set(CMAKE_CXX_STANDARD_REQUIRED ON)
- Since we wish to use the Eigen library, we need to find its header files on the system:
find_package(Eigen3 3.3 REQUIRED CONFIG)
- We include the CheckCXXCompilerFlag.cmake standard module file:
include(CheckCXXCompilerFlag)
- We check that the -march=native compiler flag works:
check_cxx_compiler_flag("-march=native" _march_native_works)
- The alternative -xHost compiler flag is also checked:
check_cxx_compiler_flag("-xHost" _xhost_works)
- We set an empty variable, _CXX_FLAGS, to hold the one compiler flag that was found to work among the two we just checked. If _march_native_works is true, we set _CXX_FLAGS to -march=native. Otherwise, if _xhost_works is true, we set _CXX_FLAGS to -xHost. If neither flag worked, we leave _CXX_FLAGS empty and no architecture-specific vectorization flag is passed to the compiler:
set(_CXX_FLAGS)
if(_march_native_works)
  message(STATUS "Using processor's vector instructions (-march=native compiler flag set)")
  set(_CXX_FLAGS "-march=native")
elseif(_xhost_works)
  message(STATUS "Using processor's vector instructions (-xHost compiler flag set)")
  set(_CXX_FLAGS "-xHost")
else()
  message(STATUS "No suitable compiler flag found for vectorization")
endif()
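The same selection logic can also be written as a loop over candidate flags, which may be convenient if more candidates are added later. This is a minimal sketch and not part of the recipe: the variable names _flag, _flag_id, and the _flag_works prefix are made up for illustration. Unlike the steps above, it stops at the first flag that works, so later candidates are not checked at all.

```cmake
include(CheckCXXCompilerFlag)

set(_CXX_FLAGS)
foreach(_flag IN ITEMS "-march=native" "-xHost")
  # Check results are cached, so each flag needs its own result variable.
  # MAKE_C_IDENTIFIER turns "-march=native" into "_march_native".
  string(MAKE_C_IDENTIFIER "${_flag}" _flag_id)
  check_cxx_compiler_flag("${_flag}" _flag_works${_flag_id})
  if(_flag_works${_flag_id})
    message(STATUS "Using processor's vector instructions (${_flag} compiler flag set)")
    set(_CXX_FLAGS "${_flag}")
    break()  # keep the first working flag only
  endif()
endforeach()

if(NOT _CXX_FLAGS)
  message(STATUS "No suitable compiler flag found for vectorization")
endif()
```

The order of the ITEMS list determines the priority, mirroring the if/elseif order used in the recipe.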
- For comparison, we also define an executable target for the unoptimized version where we do not use the preceding optimization flags:
add_executable(linear-algebra-unoptimized linear-algebra.cpp)
target_link_libraries(linear-algebra-unoptimized
  PRIVATE
    Eigen3::Eigen
  )
- In addition, we define an optimized version:
add_executable(linear-algebra linear-algebra.cpp)
target_compile_options(linear-algebra
  PRIVATE
    ${_CXX_FLAGS}
  )
target_link_libraries(linear-algebra
  PRIVATE
    Eigen3::Eigen
  )
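Since the two targets differ only in their compile options, the two definitions could alternatively be factored into a small helper function. The following is a sketch of that idea, meant as a replacement for the two target definitions above rather than an addition to them. The function name add_linear_algebra_executable is hypothetical and does not appear in the recipe.

```cmake
# Hypothetical helper: builds linear-algebra.cpp into the target <_name>,
# passing any extra arguments (ARGN) as private compile options.
function(add_linear_algebra_executable _name)
  add_executable(${_name} linear-algebra.cpp)
  target_compile_options(${_name}
    PRIVATE
      ${ARGN}
    )
  target_link_libraries(${_name}
    PRIVATE
      Eigen3::Eigen
    )
endfunction()

# Unoptimized version: no extra flags
add_linear_algebra_executable(linear-algebra-unoptimized)
# Optimized version: whichever flag was found to work, if any
add_linear_algebra_executable(linear-algebra ${_CXX_FLAGS})
```

As in the recipe, an empty _CXX_FLAGS simply results in no extra options being added to the optimized target.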
- Let us compare the two executables. First, we configure (in this case, -march=native works):
$ mkdir -p build
$ cd build
$ cmake ..
...
-- Performing Test _march_native_works
-- Performing Test _march_native_works - Success
-- Performing Test _xhost_works
-- Performing Test _xhost_works - Failed
-- Using processor's vector instructions (-march=native compiler flag set)
...
- Finally, let us compile and compare timings:
$ cmake --build .
$ ./linear-algebra-unoptimized
result: -261.505
elapsed seconds: 1.97964
$ ./linear-algebra
result: -261.505
elapsed seconds: 1.05048