Closed
Description
We are trying to build TensorFlow Serving 0.5.1 with TensorFlow 1.0.0@07bb8ea,
based on CUDA 7.5 and cuDNN 5.
Bazel 0.4.4
cd serving && bazel build -c opt --config=cuda tensorflow_serving/...
ERROR: /root/.cache/bazel/_bazel_root/f8d1071c69ea316497c31e40fe01608c/external/org_tensorflow/tensorflow/contrib/nccl/BUILD:23:1: C++ compilation of rule '@org_tensorflow//tensorflow/contrib/nccl:python/ops/_nccl_ops.so' failed: crosstool_wrapper_driver_is_not_gcc failed: error executing command external/local_config_cuda/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc -U_FORTIFY_SOURCE '-D_FORTIFY_SOURCE=1' -fstack-protector -fPIE -Wall -Wunused-but-set-parameter ... (remaining 76 argument(s) skipped): com.google.devtools.build.lib.shell.BadExitStatusException: Process exited with status 1.
In file included from external/org_tensorflow/tensorflow/contrib/nccl/kernels/nccl_manager.cc:15:0:
external/org_tensorflow/tensorflow/contrib/nccl/kernels/nccl_manager.h:23:44: fatal error: external/nccl_archive/src/nccl.h: No such file or directory
#include "external/nccl_archive/src/nccl.h"
^
compilation terminated.
INFO: Elapsed time: 147.378s, Critical Path: 107.11s
I'm able to find nccl.h on disk, but it can't be found during the bazel build. Any suggestions? Thanks in advance.
find / -name nccl.h
/root/.cache/bazel/_bazel_root/5071e8dca1385fb776f72b33971bf157/external/nccl_archive/src/nccl.h
/root/.cache/bazel/_bazel_root/f8d1071c69ea316497c31e40fe01608c/external/nccl_archive/src/nccl.h
Activity
tvkpz commented on Feb 19, 2017
Same error here.
cuda 8.0
cudnn 5.1
bazel 4.2
ERROR: /root/.cache/bazel/_bazel_root/f8d1071c69ea316497c31e40fe01608c/external/org_tensorflow/tensorflow/contrib/nccl/BUILD:23:1: C++ compilation of rule '@org_tensorflow//tensorflow/contrib/nccl:python/ops/_nccl_ops.so' failed: crosstool_wrapper_driver_is_not_gcc failed: error executing command external/local_config_cuda/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc -U_FORTIFY_SOURCE '-D_FORTIFY_SOURCE=1' -fstack-protector -fPIE -Wall -Wunused-but-set-parameter ... (remaining 77 argument(s) skipped): com.google.devtools.build.lib.shell.BadExitStatusException: Process exited with status 1.
In file included from external/org_tensorflow/tensorflow/contrib/nccl/kernels/nccl_manager.cc:15:0:
external/org_tensorflow/tensorflow/contrib/nccl/kernels/nccl_manager.h:23:44: fatal error: external/nccl_archive/src/nccl.h: No such file or directory
compilation terminated.
Any solutions?
cheyang commented on Feb 23, 2017
@kirilg, can you help take a quick look at this issue? Thank you.
kinhunt commented on Feb 24, 2017
same here

jlertle commented on Feb 24, 2017
To get around it you can comment out the DEP for nccl in: tensorflow/tensorflow/contrib/BUILD
Line 42 iirc
cheyang commented on Feb 25, 2017
Thanks, @jlertle
sskgit commented on Feb 25, 2017
Thanks @jlertle.
cosastro commented on Mar 24, 2017
Which line in tensorflow/tensorflow/contrib/BUILD is the DEP for nccl? I can't find it, thanks.
perdasilva commented on Mar 24, 2017
65: "//tensorflow/contrib/nccl:nccl_py",
I believe...
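A minimal sketch of the workaround discussed above, assuming the goal is to comment out the nccl dependency in a BUILD file. Since the line number differs between revisions (42 and 65 are both mentioned in this thread), it matches on the target label instead. The helper name `comment_out_nccl_dep` and the sample BUILD excerpt are hypothetical; the demo runs on a scratch file, not on a real checkout.

```shell
#!/bin/sh
# Hypothetical helper: prefix the nccl_py dependency line with '#',
# keeping its indentation. A .bak copy of the original file is left behind.
comment_out_nccl_dep() {
  sed -i.bak 's|^\([[:space:]]*\)\("//tensorflow/contrib/nccl:nccl_py",\)|\1# \2|' "$1"
}

# Demo on a scratch file with an illustrative (made-up) BUILD excerpt.
demo_build="$(mktemp)"
cat > "$demo_build" <<'EOF'
py_library(
    name = "contrib_py",
    deps = [
        "//tensorflow/contrib/layers:layers_py",
        "//tensorflow/contrib/nccl:nccl_py",
    ],
)
EOF

comment_out_nccl_dep "$demo_build"
cat "$demo_build"
```

On a real checkout the argument would be tensorflow/tensorflow/contrib/BUILD. Note that a later comment in this thread reports an ImportError for nccl after applying this workaround.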
jlertle commented on Mar 24, 2017
It was moved into a Windows check, but the referenced path still fails to resolve during the Serving build process on Ubuntu. Bazel stuff.
cosastro commented on Mar 27, 2017
I tried a script provided by #318, and it works fine.
skonto commented on Apr 10, 2017
If you comment it out, the examples fail. I managed to build it as well, but I get
ImportError: cannot import name nccl
with an MNIST example. Here is the task that fails:
I verified that nccl_archive is fetched and unzipped correctly under the .cache dir, and from what I can see
-iquote external/nccl_archive should be enough to include everything needed.
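As a rough illustration of how a quote-include search path behaves, the sketch below mimics the lookup in plain shell: each directory is tried in order and the first existing file wins. This is a simplification (a real compiler also searches the including file's directory and other include paths), and it does not establish why the actual build fails. The function name `resolve_quote_include` and the sandbox layout are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: resolve_quote_include <include-string> <dir>...
# Prints the first <dir>/<include-string> that exists; returns 1 if none does.
resolve_quote_include() {
  inc="$1"
  shift
  for dir in "$@"; do
    if [ -f "$dir/$inc" ]; then
      printf '%s\n' "$dir/$inc"
      return 0
    fi
  done
  return 1
}

# Scratch layout mirroring external/nccl_archive/src/nccl.h.
sandbox="$(mktemp -d)"
mkdir -p "$sandbox/external/nccl_archive/src"
: > "$sandbox/external/nccl_archive/src/nccl.h"

# With only external/nccl_archive on the search path, the short form resolves...
resolve_quote_include "src/nccl.h" "$sandbox/external/nccl_archive"

# ...while the prefixed form does not, unless the root is also searched.
resolve_quote_include "external/nccl_archive/src/nccl.h" "$sandbox/external/nccl_archive" \
  || echo "not found with only external/nccl_archive on the path"
resolve_quote_include "external/nccl_archive/src/nccl.h" "$sandbox/external/nccl_archive" "$sandbox"
```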
skonto commented on Apr 11, 2017
I solved it by removing the prefix /external/nccl_archive.
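A minimal sketch of that edit, assuming the prefix is removed from the `#include` line shown in the error message (in nccl_manager.h), so that the header is looked up relative to external/nccl_archive. The helper name `strip_nccl_prefix` is hypothetical, and the demo runs on a scratch file containing only that line rather than on the real source tree.

```shell
#!/bin/sh
# Hypothetical helper: drop the "external/nccl_archive/" prefix from quoted
# includes in the given file. A .bak copy of the original file is left behind.
strip_nccl_prefix() {
  sed -i.bak 's|#include "external/nccl_archive/|#include "|' "$1"
}

# Demo on a scratch file holding the include from the error message.
demo_file="$(mktemp)"
printf '%s\n' '#include "external/nccl_archive/src/nccl.h"' > "$demo_file"

strip_nccl_prefix "$demo_file"
cat "$demo_file"
```

On a real checkout the argument would be the path to tensorflow/contrib/nccl/kernels/nccl_manager.h; whether other files need the same change is not stated in this thread.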