用tensorflow在GPU上跑一个fit,模型里用到了两个卷积函数layers.Conv2D和layers.Conv2DTranspose
epoch时出错,具体错误日志见下文,一开始Conv2D位置报错
(1)搜索到一个类似的:No algorithm worked!,即代码开头加上以下设置
perimental import list_physical_devices, set_memory_growth
physical_devices = list_physical_devices('GPU')
set_memory_growth(physical_devices[0], True)
(2)现在Conv2DTranspose位置报错,据错误日志推测大概是调用cudnn跑卷积转置失败,使用错误日志搜索无果后,使用关键词“cudnn运行反卷积gpu失败”,同样参考类似问题:Could not load library cudnn_cnn_infer64_8.dll,可能是cudnn的底层cnn计算库本身有问题,学博主的降一个cudnn版本
pip install -i tensorflow
这里临时使用了清华镜像站作为源地址,因为官方源地址下载非常慢# 查看pip当前设置的源地址
pip config list
# 升级pip
python -m pip install --upgrade pip
# 默认使用清华镜像站,详情网址/
pip config set global.index-url
# 官方源站
import os
os.environ["CUDA_VISIBLE_DEVICES"]="-1"
错误日志
NotFoundError: Graph execution error: Detected at node ‘gradient_tape/denoise/sequential_1/conv2d_transpose_1/conv2d_transpose/Conv2DBackpropFilter’ defined at (most recent call last):
…
Node: ‘gradient_tape/denoise/sequential_1/conv2d_transpose_1/conv2d_transpose/Conv2DBackpropFilter’
No algorithm worked! Error messages:
Profiling failure on CUDNN engine 1#TC: UNKNOWN: CUDNN_STATUS_INTERNAL_ERROR
in tensorflow/stream_executor/cuda/cuda_dnn(4177): ‘cudnnConvolutionBackwardFilter( cudnn.handle(), alpha, input_nd_.handle(), input_data.opaque(), output_nd_.handle(), output_data.opaque(), conv_.handle(), ToConvBackwardFilterAlgo(algo), scratch_memory.opaque(), scratch_memory.size(), beta, filter_.handle(), filter_data.opaque())’
Profiling failure on CUDNN engine 1: UNKNOWN: CUDNN_STATUS_INTERNAL_ERROR
in tensorflow/stream_executor/cuda/cuda_dnn(4177): ‘cudnnConvolutionBackwardFilter( cudnn.handle(), alpha, input_nd_.handle(), input_data.opaque(), output_nd_.handle(), output_data.opaque(), conv_.handle(), ToConvBackwardFilterAlgo(algo), scratch_memory.opaque(), scratch_memory.size(), beta, filter_.handle(), filter_data.opaque())’
Profiling failure on CUDNN engine 0#TC: UNKNOWN: CUDNN_STATUS_EXECUTION_FAILED
in tensorflow/stream_executor/cuda/cuda_dnn(4177): ‘cudnnConvolutionBackwardFilter( cudnn.handle(), alpha, input_nd_.handle(), input_data.opaque(), output_nd_.handle(), output_data.opaque(), conv_.handle(), ToConvBackwardFilterAlgo(algo), scratch_memory.opaque(), scratch_memory.size(), beta, filter_.handle(), filter_data.opaque())’
Profiling failure on CUDNN engine 0: UNKNOWN: CUDNN_STATUS_EXECUTION_FAILED
in tensorflow/stream_executor/cuda/cuda_dnn(4177): ‘cudnnConvolutionBackwardFilter( cudnn.handle(), alpha, input_nd_.handle(), input_data.opaque(), output_nd_.handle(), output_data.opaque(), conv_.handle(), ToConvBackwardFilterAlgo(algo), scratch_memory.opaque(), scratch_memory.size(), beta, filter_.handle(), filter_data.opaque())’
Profiling failure on CUDNN engine 3#TC: UNKNOWN: CUDNN_STATUS_INTERNAL_ERROR
in tensorflow/stream_executor/cuda/cuda_dnn(4177): ‘cudnnConvolutionBackwardFilter( cudnn.handle(), alpha, input_nd_.handle(), input_data.opaque(), output_nd_.handle(), output_data.opaque(), conv_.handle(), ToConvBackwardFilterAlgo(algo), scratch_memory.opaque(), scratch_memory.size(), beta, filter_.handle(), filter_data.opaque())’
Profiling failure on CUDNN engine 3: UNKNOWN: CUDNN_STATUS_INTERNAL_ERROR
in tensorflow/stream_executor/cuda/cuda_dnn(4177): ‘cudnnConvolutionBackwardFilter( cudnn.handle(), alpha, input_nd_.handle(), input_data.opaque(), output_nd_.handle(), output_data.opaque(), conv_.handle(), ToConvBackwardFilterAlgo(algo), scratch_memory.opaque(), scratch_memory.size(), beta, filter_.handle(), filter_data.opaque())’
[[{{node gradient_tape/denoise/sequential_1/conv2d_transpose_1/conv2d_transpose/Conv2DBackpropFilter}}]] [Op:__inference_train_function_973]
本文发布于:2024-02-02 18:54:54,感谢您对本站的认可!
本文链接:https://www.4u4v.net/it/170687130645769.html
版权声明:本站内容均来自互联网,仅供演示用,请勿用于商业和其他非法用途。如果侵犯了您的权益请与我们联系,我们将在24小时内删除。
留言与评论(共有 0 条评论) |