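Ubuntu 22.04's RDP does not support audio forwarding, so I installed Ubuntu 23.04. However, the official ROCm binary packages support Ubuntu 22.04 at most, not Ubuntu 23.04, so the only option was to build from source. A fellow user told me I could run ROCm in Docker, but by then I had already spent several days on this and the sunk cost was too high, so I gritted my teeth and kept going until it finally worked. These are my notes on the result; steps may be missing and the order may be off, so treat them as a reference only. If CMake reports errors, try adding or removing -DCMAKE_PREFIX_PATH=/opt/rocm/; after all, the ROCm libraries are kept separate from the main system libraries to avoid conflicts.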
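First, one thing to be clear about: ROCm is installed under /opt/rocm/, while the rocm-llvm toolchain is installed under /opt/rocm/llvm; you can confirm this by unpacking the official deb packages. Be careful not to install the contents of /opt/rocm/llvm into /opt/rocm/, or you will get errors. Note that the -DCMAKE_INSTALL_PREFIX=/opt/rocm/ parameter sets the install location; the default is /usr/local.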
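If you accidentally install to the wrong location, you can delete the LLVM-related libraries under /opt/rocm/lib by running the following command from that directory:
sudo grep -lrIZ https://llvm.org/LICENSE.txt . | sudo xargs -0 rm -f --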
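Because the command above deletes files without asking, it may be worth previewing the matches first. The following is a minimal sketch, not part of the original procedure; the helper name `list_llvm_files` is hypothetical, and it only lists files that contain the LLVM license URL.

```shell
# list_llvm_files DIR: print the files under DIR that contain the LLVM
# license URL. Hypothetical helper; it only lists, nothing is deleted.
list_llvm_files() {
    dir="${1:-.}"
    # -l: file names only, -r: recursive, -I: skip binary files,
    # -Z: NUL-terminate names (converted to newlines for display)
    grep -lrIZ 'https://llvm.org/LICENSE.txt' "$dir" | tr '\0' '\n'
}
```

For example, `list_llvm_files /opt/rocm/lib` shows what the deletion command would remove if it were run from that directory.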
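1. Build and install the llvm-rocm toolchain
Before installing this toolchain, make sure that some other toolchain is already installed; either LLVM or GNU will do. For the first make install, it is advisable not to use sudo, so that nothing ends up in the wrong place.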
sudo mkdir -p /opt/rocm/llvm
cd
git clone https://github.com/RadeonOpenCompute/llvm-project.git -b amd-stg-open
cd llvm-project
mkdir build
cd build/
cmake -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_PROJECTS="clang;lld" \
-DLLVM_ENABLE_RUNTIMES="libcxx;libcxxabi;libunwind;compiler-rt" \
-DLLVM_TARGETS_TO_BUILD="AMDGPU;X86" \
-DCMAKE_INSTALL_PREFIX=/opt/rocm/llvm ../llvm
sudo make install
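One way to follow the advice about not using sudo on the first install is to stage the install into a scratch directory and inspect where the files would go. This is a sketch under assumptions rather than part of the original steps: the helper name `stage_install` is hypothetical, and it relies on the `DESTDIR` variable that CMake-generated Makefiles honor.

```shell
# stage_install BUILD_DIR STAGE_DIR: run the install step into a scratch
# directory and list what would be written. Hypothetical helper.
# STAGE_DIR should be an absolute path; it is prepended to the install prefix.
stage_install() {
    build_dir="$1"
    stage_dir="$2"
    make -C "$build_dir" install DESTDIR="$stage_dir" >/dev/null || return 1
    # Print the staged files relative to the scratch root
    (cd "$stage_dir" && find . -type f | sort)
}
```

If every listed path starts with ./opt/rocm/llvm/, the prefix is set as intended and the real `sudo make install` can follow.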
cd ../amd/device-libs
mkdir build
cd build
cmake -DCMAKE_BUILD_TYPE=Release -DCMAKE_PREFIX_PATH=/opt/rocm/llvm -DCMAKE_INSTALL_PREFIX=/opt/rocm/ ..
sudo make install
cd ../../comgr
mkdir build
cd build
cmake -DCMAKE_BUILD_TYPE=Release -DCMAKE_PREFIX_PATH="/opt/rocm/llvm;/opt/rocm/" -DCMAKE_INSTALL_PREFIX=/opt/rocm/ ..
sudo make install
2. Build and install the HIP toolchain
Reference: https://github.com/ROCm-Developer-Tools/HIP/blob/develop/docs/developer_guide/build.md
sudo apt-get install -y libelf-dev
export ROCM_BRANCH=rocm-5.7.x
git clone -b $ROCM_BRANCH https://github.com/ROCm-Developer-Tools/clr.git
git clone -b $ROCM_BRANCH https://github.com/ROCm-Developer-Tools/hip.git
git clone -b $ROCM_BRANCH https://github.com/ROCm-Developer-Tools/HIPCC.git hipcc
export CLR_DIR="$(readlink -f clr)"
export HIP_DIR="$(readlink -f hip)"
export HIPCC_DIR="$(readlink -f hipcc)"
cd $HIPCC_DIR
mkdir -p build; cd build
cmake ..
make -j4
cd $CLR_DIR
mkdir -p build; cd build
cmake -DHIP_COMMON_DIR=$HIP_DIR -DHIP_PLATFORM=amd -DCMAKE_PREFIX_PATH=/opt/rocm/ -DCMAKE_INSTALL_PREFIX=/opt/rocm/ -DHIPCC_BIN_DIR=$HIPCC_DIR/build -DHIP_CATCH_TEST=0 -DCLR_BUILD_HIP=ON -DCLR_BUILD_OCL=OFF ..
make -j$(nproc)
sudo make install
3. Build and install rocm-runtime
Reference: https://github.com/RadeonOpenCompute/ROCR-Runtime/tree/master/src
git clone https://github.com/RadeonOpenCompute/ROCT-Thunk-Interface.git
mkdir -p ROCT-Thunk-Interface/build
cd ROCT-Thunk-Interface/build
cmake -DCMAKE_INSTALL_PREFIX=/opt/rocm ..
sudo make install
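cd ../..
git clone https://github.com/RadeonOpenCompute/ROCR-Runtime.git
mkdir -p ROCR-Runtime/src/build
cd ROCR-Runtime/src/build
cmake -DCMAKE_INSTALL_PREFIX=/opt/rocm ..
sudo make install
4. Build and install rCCL
This step takes quite a long time and uses more than 60 GB of memory. If you do not have enough memory, you can enlarge the swap file or use zRAM.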
sudo apt install zram-config
sudo nano /usr/bin/init-zram-swapping
In that file, change mem=$((totalmem / 2 * 1024)) to mem=$((totalmem * 2 * 1024)), then reboot. After that, zram can use up to twice the size of your physical memory.
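To make the effect of that edit concrete, the sketch below reproduces the two expressions as small functions. It assumes that `totalmem` in the script is the total RAM in KiB, so that multiplying by 1024 gives bytes; the function names are hypothetical and not part of zram-config.

```shell
# Both helpers take total RAM in KiB (assumed unit of totalmem)
# and print the resulting zram size in bytes.

# Original expression: half of physical memory
zram_default_bytes() {
    echo $(( $1 / 2 * 1024 ))
}

# Edited expression: twice physical memory
zram_doubled_bytes() {
    echo $(( $1 * 2 * 1024 ))
}
```

On a machine with 16 GiB of RAM (16777216 KiB), the original expression gives 8589934592 bytes (8 GiB) and the edited one gives 34359738368 bytes (32 GiB). After rebooting, `swapon --show` should list the zram devices and their sizes.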
git clone https://github.com/RadeonOpenCompute/rocminfo.git -b rocm-5.7.x
cd rocminfo/
mkdir -p build
cd build
cmake -DCMAKE_PREFIX_PATH=/opt/rocm ..
sudo make install
python3 -m pip install CppHeaderParser
git clone https://github.com/RadeonOpenCompute/rocm_smi_lib.git
cd rocm_smi_lib/
mkdir -p build
cd build
cmake ..
sudo make install
git clone https://github.com/ROCmSoftwarePlatform/rccl.git
cd rccl
sudo ./install.sh -i
5. Set up environment variables
Reference: https://docs.amd.com/en/docs-5.1.3/deploy/linux/os-native/install.html
sudo tee --append /etc/ld.so.conf.d/rocm.conf <<EOF
/opt/rocm/lib
/opt/rocm/lib64
EOF
sudo ldconfig
6. Install tensorflow-rocm and test
python3 -m pip install tensorflow-rocm scikit-learn scipy matplotlib
Then run a test.
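A quick first test is to ask TensorFlow which GPUs it can see. This is a minimal sketch, assuming python3 and tensorflow-rocm are installed; the function name `tf_gpu_check` is hypothetical.

```shell
# tf_gpu_check: print the GPU devices visible to TensorFlow.
# Assumes python3 and tensorflow-rocm are installed. An empty list
# means TensorFlow did not find a usable GPU.
tf_gpu_check() {
    python3 -c 'import tensorflow as tf; print(tf.config.list_physical_devices("GPU"))'
}
```

If the import itself fails with a missing shared library, the error list at the end of this post may point to the step that still needs to be completed.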
Possible errors:
Error: Could not find NUMA using the following names: numa
Fix: apt-get install libnuma-dev
Error: ‘rocm_smi/rocm_smi.h’ file not found
Fix: install the SMI library as described in step 4.
Error: ImportError: cannot import name ‘np_utils’ from ‘keras.utils’
Fix: change the code to use from keras import utils, then call utils.to_categorical(…).
Error: librccl.so.1: cannot open shared object file: No such file or directory
Fix: complete step 4.
Error: clang: error: invalid target ID ‘gfx941’; format is a processor name followed by an optional colon-delimited list of features followed by an enable/disable sign (e.g., ‘gfx908:sramecc:xnack-’)
Fix: complete step 1 to install the latest rocm-llvm toolchain, and make sure -DCMAKE_PREFIX_PATH=/opt/rocm/ is set.
Error: Could not find a configuration file for package “hsa-runtime64”
Fix: complete step 3.
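Several of the errors above come down to a file that an earlier step should have installed. The sketch below checks for such files up front. The helper `report_missing` is hypothetical, and the paths in the example call are assumptions about a default /opt/rocm layout, so adjust them to your system.

```shell
# report_missing FILE...: print every path that does not exist.
# Returns 1 if anything is missing, 0 otherwise. Hypothetical helper.
report_missing() {
    missing=0
    for f in "$@"; do
        if [ ! -e "$f" ]; then
            echo "missing: $f"
            missing=1
        fi
    done
    return "$missing"
}

# Example call; these paths are assumptions, not taken from the steps above:
# report_missing \
#     /usr/include/numa.h \
#     /opt/rocm/include/rocm_smi/rocm_smi.h \
#     /opt/rocm/lib/librccl.so.1 \
#     /opt/rocm/lib/cmake/hsa-runtime64/hsa-runtime64-config.cmake
```

Each reported path maps to one of the fixes listed: numa.h to libnuma-dev, rocm_smi.h and librccl.so.1 to step 4, and the hsa-runtime64 configuration file to step 3.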