Double free or corruption issue when make runtest or train mnist model #5282

@zagwin

Issue summary

I can build Caffe successfully: make all, make pycaffe, and make test all complete without errors.
When I run make runtest, it stops immediately. When I train the MNIST model, it stops early and gives the same error.

I didn't change anything; I just cloned the repository and ran make. I have struggled with this issue for a long time. Can anybody help me find out what is wrong? Thanks.

*** Error in `.build_debug/tools/caffe': double free or corruption (out): 0x0000000002119160 ***
======= Backtrace: =========
/lib/x86_64-linux-gnu/libc.so.6(+0x777e5)[0x7f01f4ea87e5]
/lib/x86_64-linux-gnu/libc.so.6(+0x7fe0a)[0x7f01f4eb0e0a]
/lib/x86_64-linux-gnu/libc.so.6(cfree+0x4c)[0x7f01f4eb498c]
/usr/lib/x86_64-linux-gnu/libprotobuf.so.9(_ZN6google8protobuf8internal28DestroyDefaultRepeatedFieldsEv+0x1f)[0x7f01f61be8af]
/usr/lib/x86_64-linux-gnu/libprotobuf.so.9(_ZN6google8protobuf23ShutdownProtobufLibraryEv+0x8b)[0x7f01f61bdb3b]
/usr/lib/x86_64-linux-gnu/libmirprotobuf.so.3(+0x20329)[0x7f01d04fd329]
/lib64/ld-linux-x86-64.so.2(+0x10c17)[0x7f01f85e8c17]
/lib/x86_64-linux-gnu/libc.so.6(+0x39ff8)[0x7f01f4e6aff8]
/lib/x86_64-linux-gnu/libc.so.6(+0x3a045)[0x7f01f4e6b045]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf7)[0x7f01f4e51837]
.build_debug/tools/caffe[0x426dd9]

My Makefile.config is attached:
Makefile.config.pdf

The full debug output is also attached for the record:
debug output.pdf

Your system configuration

Operating system: Ubuntu 16.04 Desktop
Compiler: gcc
CUDA version (if applicable): 8.0
CUDNN version (if applicable): 5.1
BLAS: atlas
Python or MATLAB version (for pycaffe and matcaffe respectively): anaconda python 2.7

Best,
Weldon

shelhamer (Member) commented on Apr 14, 2017

Sorry, this seems to be a system issue. Please ask installation questions on the mailing list.

From https://github.com/BVLC/caffe/blob/master/CONTRIBUTING.md:

Please do not post usage, installation, or modeling questions, or other requests for help to Issues.
Use the caffe-users list instead. This helps developers maintain a clear, uncluttered, and efficient view of the state of Caffe.

cailile commented on Apr 24, 2017

Hi, I ran into this problem recently when running the caffe/ssd branch. The cause turned out to be that caffe was linked against both libprotobuf.so and libprotobuf-lite.so at the same time, which leads to the same memory being freed twice. You can check whether you have this double-link problem by listing the libraries that the built caffe binary links to:

ldd caffe | grep proto

In my case, caffe was linked against libprotobuf.so.10, libprotobuf-lite.so.10 and libmirprotobuf.so.3 at the same time; the latter two were pulled in through opencv_highgui. After I removed OpenCV's highgui library from caffe's Makefile and removed the functions that use it from the source files, the problem was gone.
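The check described above can be wrapped in a small helper. This is only a sketch: the function name check_proto_mix is made up for illustration, and it assumes the usual ldd output format, where the first field of each line is the library name.

```shell
# check_proto_mix (hypothetical name): reads `ldd` output on stdin, prints the
# protobuf-family libraries it finds, and says whether both the full and the
# lite runtime are present.
check_proto_mix() {
    libs=$(awk '{print $1}' | grep protobuf | sort -u)
    if [ -n "$libs" ]; then
        printf '%s\n' "$libs"
    fi
    if printf '%s\n' "$libs" | grep -q '^libprotobuf\.so' &&
       printf '%s\n' "$libs" | grep -q '^libprotobuf-lite\.so'; then
        echo "MIXED: libprotobuf and libprotobuf-lite are both linked"
    else
        echo "OK: no full/lite protobuf mix found"
    fi
}

# Usage (the binary path is the one from the error message in this issue;
# adjust it to your own build directory):
# ldd .build_debug/tools/caffe | check_proto_mix
```

A "MIXED" result only shows that both runtimes are loaded; it does not by itself prove that this is the cause of the crash.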

Hope this helps and good luck!

jmuncaster commented on Jun 4, 2017

@cailile thank you for your comment. I encountered this problem recently and you helped me fix it. The GTK build of opencv_highgui was responsible for bringing in libprotobuf-lite.so. My fix, which does not require changing the source code, was to rebuild OpenCV against Qt5 instead of GTK and then rebuild caffe. On Ubuntu 16.04 the Qt5 package is "qt5-default" and the OpenCV cmake option is WITH_QT.
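To see which OpenCV library is the one bringing in a protobuf runtime, one option is to run ldd on each library in turn. The helper below is a sketch with a made-up name (libs_pulling_protobuf), and the install directory in the usage line is an assumption. Note that ldd resolves dependencies transitively, so a hit means the library pulls protobuf in directly or through something else, such as GTK.

```shell
# libs_pulling_protobuf (hypothetical name): for each shared library passed as
# an argument, print the protobuf-family libraries that ldd resolves for it.
libs_pulling_protobuf() {
    for lib in "$@"; do
        [ -e "$lib" ] || continue   # skip unmatched globs and missing files
        hits=$(ldd "$lib" 2>/dev/null | awk '{print $1}' | grep protobuf | sort -u | tr '\n' ' ')
        if [ -n "$hits" ]; then
            echo "$lib: $hits"
        fi
    done
}

# Usage (the directory is an assumption; use wherever OpenCV is installed):
# libs_pulling_protobuf /usr/local/lib/libopencv_*.so
```

Running it before and after rebuilding OpenCV gives a quick way to confirm whether highgui still drags in libprotobuf-lite.so.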

jontitalukdar commented on Jun 13, 2017

@cailile I have encountered the exact same problem while installing the caffe/ssd branch, as described here. However, the solution you described is a bit unclear to me, and it would really help if you could elaborate on how you solved it. Thanks a lot.

cailile commented on Jun 15, 2017

@jontitalukdar Here are some more comments. The solution I currently use is to roll back to Ubuntu 14.04, because simply excluding opencv_highgui when building caffe only solves the problem on the caffe side. Later on, when I wanted to import both caffe and cv2 in Python, the problem came up again. I am not sure whether there is a way for libprotobuf and libprotobuf-lite to run together. @jmuncaster's solution is worth a try. If he had posted it earlier, I might not have had to roll back to Ubuntu 14.04 :)
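For the Python case, one way to see which protobuf runtimes end up in the same process is to dump the process memory map after both imports. This is a Linux-only sketch: the function name mapped_proto_libs is made up, and it assumes that `python` is the interpreter in which both caffe and cv2 are importable.

```shell
# mapped_proto_libs (hypothetical name): reads /proc/<pid>/maps-style text on
# stdin and prints the unique protobuf library paths mapped into the process.
mapped_proto_libs() {
    awk '{print $NF}' | grep protobuf | sort -u
}

# Usage (Linux only; the map is printed before the interpreter exits, so the
# output should appear even if the process aborts during shutdown):
# python -c 'import caffe, cv2; print(open("/proc/self/maps").read())' | mapped_proto_libs
```

If both libprotobuf.so and libprotobuf-lite.so show up, the process is in the same situation as the caffe binary described earlier in this thread.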

jontitalukdar commented on Jun 15, 2017

@cailile Thank you so much for your reply. You are absolutely correct: opencv_highgui causes problems when importing both caffe and cv2 within the same script. Moreover, I had installed OpenCV in a Python virtual environment, which caused some further errors. Removing either libprotobuf or libprotobuf-lite might cause further unforeseen problems in the future.

So I tried rebuilding OpenCV with Qt5 instead of GTK, as proposed by @jmuncaster, and it worked!
I cleaned the original OpenCV build and then reinstalled it with Qt5.

make clean
mkdir build
cd build/
cmake -DCMAKE_BUILD_TYPE=RELEASE -DCMAKE_INSTALL_PREFIX=/usr/local -DFORCE_VTK=ON -DWITH_TBB=ON -DWITH_V4L=ON -DWITH_QT=ON -DWITH_OPENGL=ON -DWITH_CUBLAS=ON -DCUDA_NVCC_FLAGS="-D_FORCE_INLINES" -DWITH_GDAL=ON -DWITH_XINE=ON -DBUILD_EXAMPLES=ON ..

I also added the OpenCV library path to the Caffe Makefile.config and then rebuilt caffe/ssd using make.

LIBRARY_DIRS := $(PYTHON_LIB) /usr/local/lib /usr/lib /usr/lib/x86_64-linux-gnu/hdf5/serial /usr/local/share/OpenCV/3rdparty/lib/

It seems to have worked for me. I will keep a close watch in case any other discrepancies crop up, but it works for now!
Thank you so much for your help @cailile :)

novate commented on Dec 22, 2017

@cailile Thank you so much for your solution. The double free or corruption problem is now gone. The side effect is that when caffe is built without highgui, we can't use things like a webcam or output detections as video.
@jontitalukdar Here is a suggestion: when building OpenCV, I strongly suggest adding -D WITH_GTK=NO. Without this, my computer automatically builds with GTK if it can find GTK packages installed, and I don't know why.
What's more, I couldn't install qt5-default (apt-get fails with lots of unmet dependencies, and I don't know why), so I used Qt4 instead to compile OpenCV, and it works.
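Since cmake may pick a GUI backend on its own, it can help to check what was actually configured before starting a long build. The sketch below reads the CMakeCache.txt that cmake writes into the build directory. The function name gui_backend_flags is made up, and the exact set of WITH_GTK* and WITH_QT* entries depends on the OpenCV version.

```shell
# gui_backend_flags (hypothetical name): print the GTK- and Qt-related
# WITH_* entries from a CMakeCache.txt file given as the first argument.
gui_backend_flags() {
    grep -E '^WITH_(GTK|QT)' "$1"
}

# Usage, from the OpenCV build directory after running cmake:
# gui_backend_flags CMakeCache.txt
```

For the setup described above, one would hope to see the GTK entry set to a false value (OFF or NO) and the Qt entry set to ON.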

wishinger-li commented on Dec 26, 2017

@cailile Thanks for your suggestion. It worked on my computer, but I have another question.
The same code ran smoothly when I used it three months ago. When I run it now, it fails with this error.
So what changed during this period?

panecho commented on Aug 9, 2018

I solved it according to #5777.

laker-sprus commented on Oct 26, 2018

Nice. This also works for the "./upgrade_net_proto_binary" abort problem.
