SyntaxNet fails to build with GPU support #248

Closed

@nryant

I've been trying for over a day to get SyntaxNet to build with GPU support, and while every attempt passes all tests, invariably the version of TensorFlow that it compiles lacks GPU support:

ldd models/syntaxnet/bazel-bin/syntaxnet/parser_trainer.runfiles/external/org_tensorflow/tensorflow/python/_pywrap_tensorflow.so
    linux-vdso.so.1 =>  (0x00007ffc2cbd6000)
    libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f1ba0e88000)
    libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f1ba0b82000)
    libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f1ba0964000)
    libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f1ba05e8000)
    libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f1ba03d1000)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f1ba000c000)
    /lib64/ld-linux-x86-64.so.2 (0x00007f1ba2f7a000)
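The check above can be automated. The following is a minimal sketch that scans `ldd` output for CUDA libraries, assuming (as the post does) that a GPU-enabled build would list them as dynamic dependencies. The prefix list and the function names are illustrative assumptions, not part of SyntaxNet or TensorFlow.

```python
# Library name prefixes that a CUDA-enabled binary is assumed to link against.
# This list is an illustrative assumption, not taken from the thread.
CUDA_LIB_PREFIXES = ("libcudart", "libcublas", "libcudnn", "libcufft", "libcurand")


def linked_libraries(ldd_output):
    """Return the base names of the libraries listed in `ldd` output."""
    names = []
    for line in ldd_output.splitlines():
        fields = line.split()
        if fields:
            # The first field is the library name or path, e.g. "libm.so.6".
            names.append(fields[0].rsplit("/", 1)[-1])
    return names


def cuda_libraries(ldd_output):
    """Return only the linked libraries whose names look like CUDA libraries."""
    return [n for n in linked_libraries(ldd_output) if n.startswith(CUDA_LIB_PREFIXES)]


# A few lines from the output quoted in the post.
SAMPLE = """\
    linux-vdso.so.1 =>  (0x00007ffc2cbd6000)
    libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f1ba0e88000)
    libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f1ba05e8000)
    /lib64/ld-linux-x86-64.so.2 (0x00007f1ba2f7a000)
"""

print(cuda_libraries(SAMPLE))  # prints []
```

An empty list is consistent with the symptom described here. It is not proof on its own, since a library can also be loaded at run time without appearing in `ldd` output.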

I've done this with both the current version of SyntaxNet (a4b7bb9) and also the original release (32ab5a5) with the following system setup:

  • Ubuntu 14.04 LTS
  • TITAN X
  • CUDA 7.5
  • cuDNN v4
  • g++ 4.8.4
  • bazel 0.2.2b
  • Python 2.7.10

NOTE that I've never had trouble compiling TensorFlow separately. Has anyone experienced similar issues recently?

Activity

flashxing commented on Aug 1, 2016

I have the same problem as you. Have you solved it?

todtom commented on Aug 21, 2016

@nryant Hi, I have the same problem. Could you tell me how to solve it?

David-Ba commented on Aug 22, 2016

Hi, I had the same problem and managed to build SyntaxNet with GPU support with the following steps:

  1. Make sure you have the following environment variables set:
    CUDA_HOME="[path_to_cuda_top_directory]"
    LD_LIBRARY_PATH="[path_to_cuda_lib64_directory]:$LD_LIBRARY_PATH"
    PATH="[path_to_cuda_bin_directory]:$PATH"
  2. Add the line build --config=cuda to tools/bazel.rc
  3. Add the line cxx_builtin_include_directory: "/usr/local/cuda-7.5/targets/x86_64-linux/include" to tensorflow/third_party/gpus/crosstool/CROSSTOOL (with the cuda part pointing to your CUDA installation)
  4. Force TensorFlow to use CUDA by changing the //conditions:default part in syntaxnet/syntaxnet.bzl from if_false to if_true
  5. Do the same thing for tensorflow/third_party/gpus/cuda/build_defs.bzl
  6. Build SyntaxNet using this command: bazel test -c opt --config=cuda --define using_cuda_nvcc=true --define using_gcudacc=true syntaxnet/... util/utf8/...
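Step 1 is easy to get subtly wrong, for example with a stray space inside a path list. The sketch below is one way to sanity-check the three variables before building. It assumes the usual layout in which the lib64 and bin directories sit directly under CUDA_HOME, and `check_cuda_env` is a hypothetical helper name, not something SyntaxNet or Bazel provides.

```python
import os


def check_cuda_env(env):
    """Return a list of problems found in the CUDA-related variables of `env`.

    Assumes (for illustration) that the library and binary directories are
    CUDA_HOME/lib64 and CUDA_HOME/bin.
    """
    cuda_home = env.get("CUDA_HOME", "").rstrip("/")
    if not cuda_home:
        return ["CUDA_HOME is not set"]

    expected = {
        "LD_LIBRARY_PATH": cuda_home + "/lib64",
        "PATH": cuda_home + "/bin",
    }
    problems = []
    for var in ("LD_LIBRARY_PATH", "PATH"):
        entries = env.get(var, "").split(":")
        # A space next to a ':' separator becomes part of the directory name.
        if any(e != e.strip() for e in entries):
            problems.append(var + " has an entry with stray whitespace")
        if expected[var] not in [e.strip().rstrip("/") for e in entries]:
            problems.append(var + " does not contain " + expected[var])
    return problems


if __name__ == "__main__":
    for problem in check_cuda_env(dict(os.environ)):
        print(problem)
```

Run in the shell you intend to build from, it prints nothing when the three variables look consistent.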

Two tests will fail because SyntaxNet cannot find the CUDA dependencies for some reason (cf. test logs). It seems that the LD_LIBRARY_PATH variable is not set in the test environment. Running the parser_eval and parser_trainer scripts, however, should not be a problem. Running SyntaxNet on the example at this stage might cause a CUDA_OUT_OF_MEMORY error. A fix for this is available here: #173

Side note: I used Ubuntu 14.04, CUDA 7.5, and cuDNN 4.0.7

todtom commented on Aug 22, 2016

@David-Ba I'm not sure why bazel.rc sets crosstool_top to //third_party/gpus/crosstool; maybe the first line of tools/bazel.rc needs to be changed to something like //tensorflow/third_party/gpus/crosstool. I followed your 6 steps and an additional error occurred.

The command was ~/tools/tensorflow/models/syntaxnet$ bazel test -c opt --config=cuda --define using_cuda_nvcc=true --define using_gcudacc=true syntaxnet/... util/utf8/...
and it printed these messages:

INFO: Found 68 targets and 17 test targets...
INFO: From Compiling external/org_tensorflow/tensorflow/core/kernels/spacetodepth_op_gpu.cu.cc:
nvcc warning : option '--relaxed-constexpr' has been deprecated and replaced by option '--expt-relaxed-constexpr'.
nvcc warning : option '--relaxed-constexpr' has been deprecated and replaced by option '--expt-relaxed-constexpr'.
/usr/include/string.h: In function 'void* __mempcpy_inline(void*, const void*, size_t)':
/usr/include/string.h:652:42: error: 'memcpy' was not declared in this scope
   return (char *) memcpy (__dest, __src, __n) + __n;
                                          ^
ERROR: /home/hjm/.cache/bazel/_bazel_hjm/1e0c52c2d9671225fb0df00406e3d29b/external/org_tensorflow/tensorflow/core/kernels/BUILD:1445:1: output 'external/org_tensorflow/tensorflow/core/kernels/_objs/depth_space_ops_gpu/external/org_tensorflow/tensorflow/core/kernels/spacetodepth_op_gpu.cu.pic.o' was not created.
ERROR: /home/hjm/.cache/bazel/_bazel_hjm/1e0c52c2d9671225fb0df00406e3d29b/external/org_tensorflow/tensorflow/core/kernels/BUILD:1445:1: not all outputs were created.
INFO: Elapsed time: 33.099s, Critical Path: 32.81s
//syntaxnet:arc_standard_transitions_test                             NO STATUS
//syntaxnet:beam_reader_ops_test                                      NO STATUS
//syntaxnet:binary_segment_state_test                                 NO STATUS
//syntaxnet:char_properties_test                                      NO STATUS
//syntaxnet:graph_builder_test                                        NO STATUS
//syntaxnet:lexicon_builder_test                                      NO STATUS
//syntaxnet:morphology_label_set_test                                 NO STATUS
//syntaxnet:parser_features_test                                      NO STATUS
//syntaxnet:parser_trainer_test                                       NO STATUS
//syntaxnet:reader_ops_test                                           NO STATUS
//syntaxnet:segmenter_utils_test                                      NO STATUS
//syntaxnet:sentence_features_test                                    NO STATUS
//syntaxnet:shared_store_test                                         NO STATUS
//syntaxnet:tagger_transitions_test                                   NO STATUS
//syntaxnet:text_formats_test                                         NO STATUS
//util/utf8:unicodetext_unittest                                      NO STATUS

 Executed 0 out of 17 tests: 1 fails to build and 16 were skipped.

I'm new to TensorFlow and only want to get the parse trees faster by using GPUs. I'm sincerely sorry if these are silly questions.

I used Ubuntu 16.04, CUDA 7.5, cuDNN 4.0.7, and a GeForce GTX TITAN X.

By the way, SyntaxNet ran successfully on CPUs, but it was too slow. Some experiments written with Theano worked well on the GPUs.

David-Ba commented on Aug 22, 2016

@todtom Yes, I set crosstool_top in tools/bazel.rc to build:cuda --crosstool_top=@org_tensorflow//third_party/gpus/crosstool. I forgot to mention that. I am also not sure whether this is the right way to go; I just looked through the config files and changed them to what I thought was right. However, I have not encountered your error so far. Maybe run bazel clean and then rebuild. It helps sometimes.

todtom commented on Aug 22, 2016

bazel clean does not seem to work for me. Can anyone help me?

calberti (Contributor) commented on Aug 25, 2016

Thanks @David-Ba for your detailed answer!
@todtom: the issue with running bazel clean seems unrelated to GPU support. Can you open a new issue or ask on Stack Overflow to get more help if needed?

chrhad commented on Oct 22, 2016

I have followed the 6 steps provided by @David-Ba as follows:

  1. Make sure you have the following environment variables set:
    CUDA_HOME="[path_to_cuda_top_directory]"
    LD_LIBRARY_PATH="[path_to_cuda_lib64_directory]:$LD_LIBRARY_PATH"
    PATH="[path_to_cuda_bin_directory]:$PATH"
  2. Add the line build --config=cuda to tools/bazel.rc
  3. Add the line cxx_builtin_include_directory: "/usr/local/cuda-7.5/targets/x86_64-linux/include" to tensorflow/third_party/gpus/crosstool/CROSSTOOL (with the cuda part pointing to your CUDA installation)
  4. Force TensorFlow to use CUDA by changing the //conditions:default part in syntaxnet/syntaxnet.bzl from if_false to if_true
  5. Do the same thing for tensorflow/third_party/gpus/cuda/build_defs.bzl
  6. Build SyntaxNet using this command: bazel test -c opt --config=cuda --define using_cuda_nvcc=true --define using_gcudacc=true syntaxnet/... util/utf8/...

and set crosstool_top in tools/bazel.rc to build:cuda --crosstool_top=@org_tensorflow//third_party/gpus/crosstool

Yet the build fails with the following error:
ERROR: no such target '@org_tensorflow//third_party/gpus/crosstool:crosstool': target 'crosstool' not declared in package 'third_party/gpus/crosstool' defined by /home/christian/.cache/bazel/_bazel_christian/d9875fd54a23cac839e874ac491a28bb/external/org_tensorflow/third_party/gpus/crosstool/BUILD.

Reverting crosstool_top to build:cuda --crosstool_top=//third_party/gpus/crosstool gives the following error:
ERROR: no such package 'third_party/gpus/crosstool': BUILD file not found on package path.

Have I missed anything? My CUDA version is 7.0, with cuDNN version 4.0.7.
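The two error messages differ because of how Bazel reads a label. The sketch below splits a label into repository, package, and target, applying Bazel's shorthand rule that an omitted ":target" defaults to the last component of the package path. `expand_label` is a hypothetical helper written for illustration; it is not a Bazel API and only handles absolute labels.

```python
def expand_label(label):
    """Split an absolute Bazel label into (repository, package, target).

    An empty repository string stands for the main workspace. When the
    ":target" part is omitted, the last package path component is used.
    """
    repo = ""
    if label.startswith("@"):
        repo, sep, rest = label[1:].partition("//")
        if not sep:
            raise ValueError("expected '//' after the repository name: " + label)
    elif label.startswith("//"):
        rest = label[2:]
    else:
        raise ValueError("expected an absolute label: " + label)

    package, sep, target = rest.partition(":")
    if not sep:
        # Shorthand form: //a/b/c means //a/b/c:c
        target = package.rsplit("/", 1)[-1]
    return repo, package, target


print(expand_label("@org_tensorflow//third_party/gpus/crosstool"))
# prints ('org_tensorflow', 'third_party/gpus/crosstool', 'crosstool')
print(expand_label("//third_party/gpus/crosstool"))
# prints ('', 'third_party/gpus/crosstool', 'crosstool')
```

Read this way, the first error says that the package was found in the external org_tensorflow repository but declares no target named crosstool. The second says that the SyntaxNet workspace itself has no third_party/gpus/crosstool package at all.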

hfxunlp commented on Nov 14, 2016

ERROR:no such package 'third_party/gpus/crosstool': BUILD file not found on package path.

        SyntaxNet fails to build with GPU support · Issue #248 · tensorflow/models