RuntimeError: CUDA error: invalid device ordinal CUDA kernel errors might be asynchronously reported

Error: RuntimeError: CUDA error: invalid device ordinal
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.

Code:

parser.add_argument('--gpu', type=int, default=3)
args = parser.parse_args()
os.environ["CUDA_VISIBLE_DEVICES"] = str(args.gpu)
torch.cuda.set_device(args.gpu)

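With gpu set to 0, it runs with no problem at all.

The machine clearly has 8 cards, and I only set gpu = 3, so why does it keep raising this error?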

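The reasons GPT gave:

Make sure the CUDA version you are using is compatible with your PyTorch version. A mismatch between the two can sometimes cause this error.
Make sure your device index is within the valid range. An out-of-range device index raises this error.
Try CUDA_LAUNCH_BLOCKING=1. Set this environment variable before running the script to make CUDA report errors synchronously, so the error is raised immediately and the stack trace points at the right call, which helps with debugging.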
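
The second suggestion, checking that the index is in range, can be sketched as follows. This is a minimal sketch and not part of the original script: `validate_device_index` and `select_device` are hypothetical helper names. Note that the count to check against is the number of devices visible to the current process, which is not necessarily the number of cards in the machine.

```python
def validate_device_index(index: int, device_count: int) -> int:
    """Return index if it is a valid ordinal for device_count devices, else raise."""
    if not 0 <= index < device_count:
        raise ValueError(
            f"device index {index} is out of range: "
            f"only {device_count} device(s) visible to this process"
        )
    return index


def select_device(index: int) -> None:
    """Check the index against what PyTorch reports, then select the device."""
    import torch  # imported lazily so the helper above works without PyTorch

    validate_device_index(index, torch.cuda.device_count())
    torch.cuda.set_device(index)
```

Calling `select_device(args.gpu)` in place of the bare `torch.cuda.set_device(args.gpu)` would turn the opaque CUDA error into a message that states how many devices the process can actually see.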

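It is definitely none of these three.

I asked again, and it said:
Confirm how many CUDA devices are available on your system.

That led to the next thought: surely the code can't be failing to detect those cards? And how many cards can it detect?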
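
One way to answer that question is to print the visibility mask next to the device count, since the count depends on the mask. A minimal sketch: `visibility_report` is a hypothetical helper, and the count passed in is assumed to come from `torch.cuda.device_count()`.

```python
import os
from typing import Mapping


def visibility_report(device_count: int, env: Mapping[str, str] = os.environ) -> str:
    """Describe the CUDA_VISIBLE_DEVICES mask alongside a reported device count."""
    mask = env.get("CUDA_VISIBLE_DEVICES")
    shown = "<unset, all GPUs visible>" if mask is None else repr(mask)
    return f"CUDA_VISIBLE_DEVICES={shown}, device_count={device_count}"
```

In the script this would be used as `print(visibility_report(torch.cuda.device_count()))`, placed right before the `set_device` call.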

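At first the status did not show the 2 on the left. Only after adding the line of code below did it look like the screenshot:
(screenshot not preserved)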

parser.add_argument('--gpu', type=int, default=3)
args = parser.parse_args()
print(torch.cuda.device_count())  # the added line
os.environ["CUDA_VISIBLE_DEVICES"] = str(args.gpu)
torch.cuda.set_device(args.gpu)

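I added one line,
print(torch.cuda.device_count()), to see how many cards it could detect, and the result:

Why can it suddenly run on the third card now??

It is as if those cards had suddenly been activated again.

The exact cause is still unclear... does anyone know why?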
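
One likely explanation, offered as an assumption and not a confirmed diagnosis: CUDA reads CUDA_VISIBLE_DEVICES when it is first initialized in the process, and the devices listed there are renumbered starting from 0. With CUDA_VISIBLE_DEVICES=3 the process sees a single device whose index is 0, so torch.cuda.set_device(3) is out of range. With gpu = 0 the mask and the index happen to coincide, so it works. Calling torch.cuda.device_count() before the variable is set may initialize CUDA with all 8 cards visible (this depends on the PyTorch version), after which the later assignment has no effect and index 3 is valid again. The sketch below illustrates the renumbering. `logical_index` and `pin_to_gpu` are hypothetical helpers, and the mask is assumed to hold plain integer ids, not UUIDs.

```python
import os
from typing import Optional


def logical_index(physical_id: int, visible: Optional[str]) -> int:
    """Map a GPU id to the index the process uses under a given mask.

    Assumes the mask is a comma-separated list of integer ids.
    """
    if visible is None:
        return physical_id  # no mask: indices are unchanged
    ids = [part.strip() for part in visible.split(",") if part.strip()]
    return ids.index(str(physical_id))  # raises ValueError if the GPU is hidden


def pin_to_gpu(physical_id: int) -> int:
    """Expose only one GPU; call this before anything initializes CUDA."""
    os.environ["CUDA_VISIBLE_DEVICES"] = str(physical_id)
    # With a single visible device, the in-process index is always 0.
    return logical_index(physical_id, os.environ["CUDA_VISIBLE_DEVICES"])
```

Under this assumption, the script would call `idx = pin_to_gpu(args.gpu)` before any CUDA call and then `torch.cuda.set_device(idx)`, or skip the environment variable entirely and call `torch.cuda.set_device(args.gpu)` with all cards visible.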