I have developed a small library of TensorFlow custom ops. These ops are programmed in C++ and make use of Eigen.
Something strange happens when I compile my custom ops module with the -march=native compiler option: the resulting module consistently aborts with a "pointer being freed was not allocated" error originating from free() in libsystem_malloc.dylib. If I compile without -march=native, there is no crash. That is, compiling with this Bazel command:
bazel build -c opt --copt=-march=native --config=cuda //tensorflow/core/user_ops:custom_ops.so
... produces a crashing module, but this one works fine:
bazel build -c opt --config=cuda //tensorflow/core/user_ops:custom_ops.so
Running a test script under lldb and disassembling, I see exactly the same assembly in both builds, modulo ASLR.
So, what might be the reason why compiling with -march=native causes a "pointer being freed was not allocated" error?
My compiler is Apple LLVM version 8.0.0 (clang-800.0.42.1) and I am running macOS 'Sierra' 10.12.4 (16E195).
UPDATE: AddressSanitizer reports:
==4445==ERROR: AddressSanitizer: attempting free on address which was not malloc()-ed: 0x62a000234220 in thread T29
    #0 0x1000d9db9 in wrap_free (libclang_rt.asan_osx_dynamic.dylib+0x4adb9)
    ...
0x62a000234220 is located 32 bytes inside of 24032-byte region [0x62a000234200,0x62a000239fe0)
allocated by thread T29 here:
    #0 0x1000d9bf0 in wrap_malloc (libclang_rt.asan_osx_dynamic.dylib+0x4abf0)
    #1 0x12f4dec62 in Eigen::BDCSVD<Eigen::Matrix<double, -1, -1, 1, -1, -1> >::allocate(long, long, unsigned int) (custom_ops.so+0x26fc62)
    #2 0x12f4d20a1 in Eigen::BDCSVD<Eigen::Matrix<double, -1, -1, 1, -1, -1> >::compute(Eigen::Matrix<double, -1, -1, 1, -1, -1> const&, unsigned int) (custom_ops.so+0x2630a1)
    #3 0x12f6822d6 in Eigen::BDCSVD<Eigen::Matrix<double, -1, -1, 1, -1, -1> >::BDCSVD(Eigen::Matrix<double, -1, -1, 1, -1, -1> const&, unsigned int) (custom_ops.so+0x4132d6)
    ...