Conv2d checks raise an error with confusing message #1472
I've checked the code that produces this error message and it looks correct. Can you show me a snippet that throws it?
Sorry, I can't; I don't have a snapshot of the code that gave the error. But I solved it by calling .cuda() on my tensor.
@apaszke -- Does this help? The snippet I was running throws the error. I'm very new to PyTorch.
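A minimal sketch of the kind of snippet being discussed, assuming a Conv2d moved to the GPU and fed a CPU tensor (the layer sizes and shapes are illustrative, not from the original comment):

```python
import torch
import torch.nn as nn

# The convolution's parameters live on the GPU...
conv = nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3).cuda()

# ...but the input is still a CPU tensor.
x = torch.zeros(1, 3, 32, 32)

# On the PyTorch versions discussed in this thread, this call raised
#   RuntimeError: expected CPU tensor (got CUDA tensor)
# even though the CPU tensor is the input and the CUDA tensors are the weights.
out = conv(x)
```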
@bkj @apaszke This error seems to appear when you call a module that has been moved to the GPU on a CPU tensor. It seems that the error message is getting something wrong: should it say "expected CUDA tensor (got CPU tensor)"? I find it somewhat confusing that calling .cuda() on the module alone produces a message with the types swapped.
Will fix. Yes, the message should be the other way around.
I also came across the same issue. With the input left on the CPU it gives the error; after moving the input to the GPU there is no error.
It seems that calling .cuda() on the model alone is not enough; the input tensor needs it as well.
Yes, I'm having the same issue too. Can we expect an update on this?
I have also run into the same situation. My code snippet is the same as in the PyTorch DQN tutorial, and when I run it I get the error. A reduced snippet gives the same error message.
@bhattad2 & @bywbilly make sure you explicitly call .cuda() on your input tensors as well as on the model.
As @varunagrawal said, adding .cuda() to the input data fixed it for me.
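A sketch of that fix, reusing the illustrative Conv2d from the earlier sketch: call .cuda() on both the module and the input before the forward pass.

```python
import torch
import torch.nn as nn

model = nn.Conv2d(3, 16, kernel_size=3)
model.cuda()                  # move the parameters to the GPU

x = torch.zeros(1, 3, 32, 32)
x = x.cuda()                  # move the input data to the GPU as well

out = model(x)                # devices now match, so no error is raised
```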
@karlTUM and @varunagrawal it worked, thanks a lot!
I have also run into this issue.
I had the same issue. While the error itself was definitely justified, the message it raised had the CPU/CUDA labels swapped. Note: for me it was on a Conv1d, but I don't think that changes much, since they all use ConvNd in the end.
I had the same issue, and solved it thanks to @karlTUM.
I get the same issue. When the model is on the GPU but receives a CPU tensor, the error says it needs a CPU tensor but got a GPU tensor.
I am seeing this error for models run with the JIT tracer when the default tensor type is a CUDA tensor. JITing on the CPU, or running without JIT on CUDA, seems fine though. I am still trying to isolate a minimal failing example. Any debugging tips are appreciated.
@neerajprad That's because a JIT load always goes to the CPU; you always need an explicit move. See #12710.
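A sketch of that explicit move, using a hypothetical file name ("traced_model.pt") purely for illustration:

```python
import torch

# Per the comment above, the loaded module starts out on the CPU, so move it
# explicitly (or pass map_location) before feeding it CUDA tensors.
model = torch.jit.load("traced_model.pt", map_location="cpu")
model = model.cuda()   # or: torch.jit.load("traced_model.pt", map_location="cuda")

x = torch.randn(1, 3, 224, 224, device="cuda")
out = model(x)
```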
Is that true for tracing as well? EDIT: Original issue - pyro-ppl/pyro#1419.
@bkj I tried this on v1.3.1:

```python
import torch
from torchvision.models import vgg16

x = torch.zeros((1, 3, 224, 224))
model = vgg16(pretrained=False)
model.cuda()
model(x)
```

and it throws, so I think the error message has been improved to match what we expect.
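For comparison, a device-agnostic variant of the same check (a sketch, not taken from the thread) that keeps the model and the input on one device:

```python
import torch
from torchvision.models import vgg16

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = vgg16(pretrained=False).to(device)
x = torch.zeros((1, 3, 224, 224), device=device)

out = model(x)   # no device mismatch on either CPU-only or CUDA machines
```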
@neerajprad Curious, were you able to construct a minimal example for the JIT case?
@neerajprad I am closing this issue now because the error message has been improved and it was originally not related to JIT. Please feel free to open a new issue and link to this one if you are able to construct a minimal example for the JIT case.
RuntimeError: expected CPU tensor (got CUDA tensor)
It shows the above message when I give it a CPU tensor rather than a CUDA tensor. What I mean is that the error should have been
RuntimeError: expected CUDA tensor (got CPU tensor)
I get this on nn.parallel.data_parallel.
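A sketch of the situation in this report, assuming an illustrative Conv2d wrapped in nn.parallel.data_parallel (the module and shapes are placeholders):

```python
import torch
import torch.nn as nn

model = nn.Conv2d(3, 8, kernel_size=3).cuda()
x = torch.zeros(4, 3, 32, 32)      # still a CPU tensor

# On the affected versions, passing the CPU tensor raised the backwards message
#   RuntimeError: expected CPU tensor (got CUDA tensor)
# out = nn.parallel.data_parallel(model, x)

# Moving the input to the GPU first avoids the mismatch entirely.
out = nn.parallel.data_parallel(model, x.cuda())
```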