ImportError: /usr/local/lib/python3.10/dist-packages/torch/lib/../../nvidia/cusparse/lib/libcusparse.so.12: undefined symbol: __nvJitLinkAddData_12_1, version libnvJitLink.so.12 #111469

qianxifu · 2023-10-18T10:33:48Z

🐛 Describe the bug

>>> import torch
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.10/dist-packages/torch/__init__.py", line 235, in <module>
    from torch._C import *  # noqa: F403
ImportError: /usr/local/lib/python3.10/dist-packages/torch/lib/../../nvidia/cusparse/lib/libcusparse.so.12: undefined symbol: __nvJitLinkAddData_12_1, version libnvJitLink.so.12
>>>

Versions

Collecting environment information...
PyTorch version: N/A
Is debug build: N/A
CUDA used to build PyTorch: N/A
ROCM used to build PyTorch: N/A

OS: Ubuntu 20.04.6 LTS (x86_64)
GCC version: (Ubuntu 9.4.0-1ubuntu1~20.04.2) 9.4.0
Clang version: Could not collect
CMake version: Could not collect
Libc version: glibc-2.31

Python version: 3.10.13 (main, Aug 25 2023, 13:20:03) [GCC 9.4.0] (64-bit runtime)
Python platform: Linux-5.4.0-162-generic-x86_64-with-glibc2.31
Is CUDA available: N/A
CUDA runtime version: 12.0.140
CUDA_MODULE_LOADING set to: N/A
GPU models and configuration: GPU 0: NVIDIA A10
Nvidia driver version: 525.105.17
cuDNN version: Probably one of the following:
/usr/local/cuda-12.0/targets/x86_64-linux/lib/libcudnn.so.8.9.1
/usr/local/cuda-12.0/targets/x86_64-linux/lib/libcudnn_adv_infer.so.8.9.1
/usr/local/cuda-12.0/targets/x86_64-linux/lib/libcudnn_adv_train.so.8.9.1
/usr/local/cuda-12.0/targets/x86_64-linux/lib/libcudnn_cnn_infer.so.8.9.1
/usr/local/cuda-12.0/targets/x86_64-linux/lib/libcudnn_cnn_train.so.8.9.1
/usr/local/cuda-12.0/targets/x86_64-linux/lib/libcudnn_ops_infer.so.8.9.1
/usr/local/cuda-12.0/targets/x86_64-linux/lib/libcudnn_ops_train.so.8.9.1
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: N/A

CPU:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
Address sizes: 46 bits physical, 48 bits virtual
CPU(s): 16
On-line CPU(s) list: 0-15
Thread(s) per core: 2
Core(s) per socket: 8
Socket(s): 1
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 106
Model name: Intel(R) Xeon(R) Platinum 8369B CPU @ 2.90GHz
Stepping: 6
CPU MHz: 2900.000
BogoMIPS: 5800.00
Hypervisor vendor: KVM
Virtualization type: full
L1d cache: 384 KiB
L1i cache: 256 KiB
L2 cache: 10 MiB
L3 cache: 48 MiB
NUMA node0 CPU(s): 0-15
Vulnerability Gather data sampling: Unknown: Dependent on hypervisor status
Vulnerability Itlb multihit: Not affected
Vulnerability L1tf: Not affected
Vulnerability Mds: Not affected
Vulnerability Meltdown: Not affected
Vulnerability Mmio stale data: Vulnerable: Clear CPU buffers attempted, no microcode; SMT Host state unknown
Vulnerability Retbleed: Not affected
Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl and seccomp
Vulnerability Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2: Mitigation; Enhanced IBRS, IBPB conditional, RSB filling, PBRSB-eIBRS SW sequence
Vulnerability Srbds: Not affected
Vulnerability Tsx async abort: Not affected
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid tsc_known_freq pni pclmulqdq monitor ssse3 fma cx16 pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch cpuid_fault invpcid_single ssbd ibrs ibpb stibp ibrs_enhanced fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid avx512f avx512dq rdseed adx smap avx512ifma clflushopt clwb avx512cd sha_ni avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves wbnoinvd arat avx512vbmi avx512_vbmi2 gfni vaes vpclmulqdq avx512_vnni avx512_bitalg avx512_vpopcntdq rdpid arch_capabilities

Versions of relevant libraries:
[pip3] numpy==1.26.1
[pip3] torch==2.1.0
[pip3] torchaudio==2.1.0
[pip3] torchvision==0.16.0
[pip3] triton==2.1.0
[conda] Could not collect

--------------------------------nvidia-smi---------------------------------------------

--------------------------------cuda version---------------------------------------------

--------------------------------install torch command---------------------------------------------
pip3 install torch torchvision torchaudio

--------------------------------python lib---------------------------------------------
certifi 2019.11.28
chardet 3.0.4
command-not-found 0.3
dbus-python 1.2.16
distro 1.4.0
distro-info 0.23+ubuntu1.1
filelock 3.12.4
fsspec 2023.9.2
idna 2.8
Jinja2 3.1.2
language-selector 0.1
MarkupSafe 2.1.3
mpmath 1.3.0
netifaces 0.10.4
networkx 3.1
numpy 1.26.1
nvidia-cublas-cu12 12.1.3.1
nvidia-cuda-cupti-cu12 12.1.105
nvidia-cuda-nvrtc-cu12 12.1.105
nvidia-cuda-runtime-cu12 12.1.105
nvidia-cudnn-cu12 8.9.2.26
nvidia-cufft-cu12 11.0.2.54
nvidia-curand-cu12 10.3.2.106
nvidia-cusolver-cu12 11.4.5.107
nvidia-cusparse-cu12 12.1.0.106
nvidia-nccl-cu12 2.18.1
nvidia-nvjitlink-cu12 12.2.140
nvidia-nvtx-cu12 12.1.105
Pillow 10.1.0
pip 23.3
PyGObject 3.36.0
pymacaroons 0.13.0
PyNaCl 1.3.0
python-apt 2.0.1+ubuntu0.20.4.1
PyYAML 5.3.1
requests 2.22.0
requests-unixsocket 0.2.0
setuptools 45.2.0
six 1.14.0
ssh-import-id 5.10
sympy 1.12
torch 2.1.0
torchaudio 2.1.0
torchvision 0.16.0
triton 2.1.0
typing_extensions 4.8.0
ubuntu-advantage-tools 8001
ufw 0.36
unattended-upgrades 0.1
urllib3 1.25.8
wheel 0.34.2


ptrblck · 2023-10-18T14:49:40Z

Not reproducible using pip install torch torchvision torchaudio:

pip install torch torchvision torchaudio
Looking in indexes: https://pypi.org/simple, https://pypi.ngc.nvidia.com
Collecting torch
  Downloading torch-2.1.0-cp310-cp310-manylinux1_x86_64.whl.metadata (25 kB)
Requirement already satisfied: torchvision in /usr/local/lib/python3.10/dist-packages (0.16.0a0)
Collecting torchaudio
  Downloading torchaudio-2.1.0-cp310-cp310-manylinux1_x86_64.whl.metadata (5.7 kB)
Requirement already satisfied: filelock in /usr/local/lib/python3.10/dist-packages (from torch) (3.12.4)
Requirement already satisfied: typing-extensions in /usr/local/lib/python3.10/dist-packages (from torch) (4.7.1)
Requirement already satisfied: sympy in /usr/local/lib/python3.10/dist-packages (from torch) (1.12)
Requirement already satisfied: networkx in /usr/local/lib/python3.10/dist-packages (from torch) (2.6.3)
Requirement already satisfied: jinja2 in /usr/local/lib/python3.10/dist-packages (from torch) (3.1.2)
Requirement already satisfied: fsspec in /usr/local/lib/python3.10/dist-packages (from torch) (2023.6.0)
Collecting nvidia-cuda-nvrtc-cu12==12.1.105 (from torch)
  Downloading nvidia_cuda_nvrtc_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (23.7 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 23.7/23.7 MB 27.2 MB/s eta 0:00:00
Collecting nvidia-cuda-runtime-cu12==12.1.105 (from torch)
  Downloading nvidia_cuda_runtime_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (823 kB)
...

# python -c "import torch; print(torch.__version__); print(torch.__path__)"
2.1.0+cu121
['/usr/local/lib/python3.10/dist-packages/torch']
# find /usr/ -name libnvJit*
/usr/local/lib/python3.10/dist-packages/nvidia/nvjitlink/lib/libnvJitLink.so.12

qianxifu · 2023-10-19T02:09:34Z

thanks for your help.

lee101 · 2023-10-19T06:15:13Z

Note that @ptrblck is on CUDA 12.1, while we are seeing this issue on CUDA 12.0.
Not sure if that's related, but updating the CUDA version is worth a try.

lee101 · 2023-10-19T07:09:22Z

LD_LIBRARY_PATH=/usr/local/cuda-11.7/lib64

I got this error message to go away by pointing this environment variable back at a previous CUDA version (11.7 in my case, instead of 12.0). I'm not sure what caused it. It also works if I just unset the variable, so I'm not sure whether it needs to be set at all with CUDA 12.0.

-Lee https://text-generator.io
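The observation above suggests that the result depends on which copy of libnvJitLink.so.12 the dynamic loader finds first. The following is a minimal sketch, assuming Linux and assuming that LD_LIBRARY_PATH is the search path that matters here; the loader also consults other locations that this does not cover. The function name `find_nvjitlink_on_path` is hypothetical.

```python
import glob
import os


def find_nvjitlink_on_path(ld_library_path):
    """Return (directory, [matching files]) pairs in search order.

    Only the directories listed in the given LD_LIBRARY_PATH-style string
    are checked; RUNPATH entries and the ldconfig cache are not covered.
    """
    hits = []
    for directory in ld_library_path.split(os.pathsep):
        if not directory:
            continue  # skip empty entries
        pattern = os.path.join(directory, "libnvJitLink.so*")
        matches = sorted(glob.glob(pattern))
        if matches:
            hits.append((directory, matches))
    return hits


if __name__ == "__main__":
    current = os.environ.get("LD_LIBRARY_PATH", "")
    for directory, files in find_nvjitlink_on_path(current):
        print(directory)
        for name in files:
            print("   ", name)
```

If the first directory printed belongs to a CUDA 12.0 toolkit, that would be consistent with the reports in this thread that unsetting the variable helps. This is an inference from the comments, not something the thread confirms.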

giovannibonisoli · 2023-10-19T09:15:05Z

I hit the same problem as in this issue:

  File "/home/user/.conda/envs/myenv/bin/torchrun", line 5, in <module>
    from torch.distributed.run import main
  File "/home/user/.conda/envs/myenv/lib/python3.8/site-packages/torch/__init__.py", line 235, in <module>
    from torch._C import *  # noqa: F403
ImportError: /home/user/.conda/envs/myenv/lib/python3.8/site-packages/torch/lib/../../nvidia/cusparse/lib/libcusparse.so.12: undefined symbol: __nvJitLinkAddData_12_1, version libnvJitLink.so.12

I tried the solution suggested by @ptrblck and the error is still not fixed!

panpan0000 · 2023-10-20T03:58:49Z

Quoting @giovannibonisoli: "I tried the solution suggested by @ptrblck and the error is not fixed, yet!"

May I ask which solution? I see the same issue with the setup below (Python 3.10).

My environment setup and the error:

pip3 install torch torchvision torchaudio

pip list |grep torch
torch                     2.1.0
torchaudio                2.1.0
torchvision               0.16.0

cat /proc/driver/nvidia/version
NVRM version: NVIDIA UNIX x86_64 Kernel Module  525.105.17  Tue Mar 28 18:02:59 UTC 2023
GCC version:  gcc version 11.4.0 (Ubuntu 11.4.0-1ubuntu1~22.04)

python3 -c "import torch;"

from torch._C import *  # noqa: F403
ImportError: /usr/local/lib/python3.10/dist-packages/torch/lib/../../nvidia/cusparse/lib/libcusparse.so.12: undefined symbol: __nvJitLinkAddData_12_1, version libnvJitLink.so.12

Workaround:

After downgrading torch to 2.0.1 (pip3 install torch==2.0.1), the issue is gone.

kk19990709 · 2023-10-21T15:01:40Z

I don't think this issue should be closed. It hasn't been solved yet. Same error with torch==2.1.0.

kk19990709 · 2023-10-21T15:17:54Z

Regarding @panpan0000's workaround of downgrading torch to 2.0.1 (pip3 install torch==2.0.1): this works, but some packages, such as xformers, require torch==2.1.0.

upenn-hughmac · 2023-10-21T22:37:33Z

Same issue. In case it's useful for others, fixed for me by either:

export LD_LIBRARY_PATH=$HOME/path/to/my/venv3115/lib64/python3.11/site-packages/nvidia/nvjitlink/lib:$LD_LIBRARY_PATH

or uninstalling 2.1.0 (stable) and installing the nightly dev preview:

python -m pip uninstall torch torchvision torchaudio
python -m pip install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cu121
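One detail of the export line above: when LD_LIBRARY_PATH is unset, `dir:$LD_LIBRARY_PATH` expands to a value with a trailing colon, and glibc's loader treats a zero-length entry as the current working directory. The following is a minimal sketch of building the value without empty or duplicate entries. The helper name `prepend_library_dir` and the example directory are hypothetical. The sketch prints an export line instead of changing `os.environ`, on the assumption that the variable has to be set before the Python process starts.

```python
import os


def prepend_library_dir(new_dir, current=""):
    """Return an LD_LIBRARY_PATH value with new_dir as the first entry.

    Empty entries are dropped, and an existing copy of new_dir is moved
    to the front rather than duplicated.
    """
    entries = [e for e in current.split(os.pathsep) if e and e != new_dir]
    return os.pathsep.join([new_dir] + entries)


if __name__ == "__main__":
    # Hypothetical directory; substitute your environment's site-packages.
    example = "/path/to/venv/lib/python3.11/site-packages/nvidia/nvjitlink/lib"
    value = prepend_library_dir(example, os.environ.get("LD_LIBRARY_PATH", ""))
    print(f"export LD_LIBRARY_PATH={value}")
```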

conan1024hao · 2023-11-07T09:12:25Z

Regarding the suggestion to downgrade torch to 2.0.1 (pip3 install torch==2.0.1): torch wasn't the problem for me. Downgrading torchaudio to 2.0.1 made the issue go away.

krodio · 2023-11-08T17:16:24Z

The LD_LIBRARY_PATH fix from @upenn-hughmac worked for me. Replace xxxx with the real path to your Python environment:
export LD_LIBRARY_PATH=$HOME/xxxx/python3.11/site-packages/nvidia/nvjitlink/lib:$LD_LIBRARY_PATH

NintendoLink · 2023-11-23T06:13:31Z

It works, thanks @upenn-hughmac!

surmount1 · 2023-12-26T02:40:36Z

Nice! I also solved this problem using @upenn-hughmac's LD_LIBRARY_PATH method:
export LD_LIBRARY_PATH=/data/home/user/anaconda3/envs/vllm/lib/python3.10/site-packages/nvidia/nvjitlink/lib:$LD_LIBRARY_PATH

jianbo27 · 2024-01-11T01:57:18Z

I tried downgrading to Python 3.9, which works for me in a conda virtual environment.

wangbluo · 2024-02-02T02:16:31Z

Same issue on torch 2.2. I have tried all of the above solutions without success. This issue shouldn't be closed at all.

python version: python 3.10
PyTorch-cuda:12.1
$CUDA_HOME: /home/share/spack/opt/spack/linux-ubuntu20.04-icelake/gcc-9.4.0/cuda-12.1.1-uxo2fr2s3d6ge4m6bo46jslallfbluei

osma · 2024-02-09T09:17:12Z

I've also had this problem. In my case, it was apparently due to a compatibility issue w.r.t. CUDA 12.0.0 that I was using.

It appears that PyTorch 2.1.x and 2.2.0 have been compiled against CUDA 12.1.0 and they use new symbols introduced in 12.1 so they won't work with CUDA 12.0.0. Installing either CUDA 12.1.0 or the older version 11.8.0 fixes the problem for me. Downgrading to PyTorch 2.0.1 also works, as it's compatible with CUDA 12.0.0.
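The explanation above can be turned into a quick check. This is a minimal sketch that encodes only what this thread reports (a CUDA 12.0.x toolkit alongside wheels built for 12.1), not an official compatibility rule. The helper names are hypothetical. The package names are the ones shown in the pip listing of the original report.

```python
from importlib import metadata


def parse_major_minor(version):
    """Turn a version string such as '12.0.140' into (12, 0)."""
    parts = version.split(".")
    return int(parts[0]), int(parts[1])


def is_cuda_12_before_12_1(version):
    """True for 12.0.x, the toolkit line reported as problematic here."""
    major, minor = parse_major_minor(version)
    return major == 12 and minor < 1


def installed_version(package):
    """Return the installed version of a pip package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    for name in ("torch", "nvidia-cusparse-cu12", "nvidia-nvjitlink-cu12"):
        print(name, installed_version(name))
    # 12.0.140 is the "CUDA runtime version" from the original report.
    print(is_cuda_12_before_12_1("12.0.140"))
```

The version lookup reads pip metadata only, so it runs even when `import torch` itself fails.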

yangfansun · 2024-03-06T02:55:59Z

Installing PyTorch with the official CUDA 11.8 setup recommended by PyTorch can fix this problem.

richardp4 · 2024-04-17T02:34:27Z

Hi,
I have the same problem.
My setup is below.

OS : ubuntu 22.04
CUDA : 12.0
cudnn : 8.8
python : 3.9 anaconda env
pytorch : 2.2.2 -> 2.0.1 (downgrade)

After downgrading PyTorch from 2.2.2 to 2.0.1, import torch works.
However, another error occurred: "ModuleNotFoundError: No module named 'torch._custom_ops'"

Please give me some tips to solve it.

osma · 2024-04-17T07:35:28Z

@richardp4 CUDA version 12.0 is your problem. Either upgrade it to 12.1+ or downgrade to 11.8.

M3Dade · 2024-04-17T11:31:04Z

@osma I hit the problem with flash_attn-2.5.7.tar.gz. It worked after I downgraded CUDA from 12.0 to 11.8, as you suggested. Thanks.

weifengpy · 2024-04-27T07:29:59Z

The LD_LIBRARY_PATH fix from @upenn-hughmac works for me. In case people are curious about how to find the site-packages path:
step 1: python3 -m pip list -v shows the full path, for example 'envs/pytorch-3.10/lib/python3.10/site-packages'
step 2: export LD_LIBRARY_PATH=full/path/envs/pytorch-3.10/lib/python3.10/site-packages/nvidia/nvjitlink/lib:$LD_LIBRARY_PATH
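The two steps above can also be sketched from inside the interpreter. This is a minimal example, assuming the wheels were installed into the running interpreter's 'purelib' directory as reported by `sysconfig`. On some distributions pip may install elsewhere, hence the existence check. The function name `nvjitlink_lib_dir` is hypothetical.

```python
import sysconfig
from pathlib import Path


def nvjitlink_lib_dir(site_packages=None):
    """Return the expected nvjitlink lib directory under site-packages.

    When site_packages is None, the running interpreter's 'purelib'
    directory is used as the base.
    """
    if site_packages:
        base = Path(site_packages)
    else:
        base = Path(sysconfig.get_paths()["purelib"])
    return base / "nvidia" / "nvjitlink" / "lib"


if __name__ == "__main__":
    lib_dir = nvjitlink_lib_dir()
    if lib_dir.is_dir():
        print(f"export LD_LIBRARY_PATH={lib_dir}:$LD_LIBRARY_PATH")
    else:
        print(f"{lib_dir} does not exist in this environment")
```

Run it with the same interpreter that fails on `import torch`, so that the printed path matches that environment.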

qianxifu closed this as completed Oct 19, 2023

lcneyc mentioned this issue Oct 24, 2023

Error occurs when I'm trying the demo "agentchat_RetrieveChat.ipynb". microsoft/autogen#370

Closed

CHELSEA234 mentioned this issue Apr 17, 2024

[Usage] ImportError: cannot import name 'LlavaLlamaForCausalLM' from 'llava.model' haotian-liu/LLaVA#1101

Closed
