Deep Learning with AMD GPU on Windows
First written: 2022/09/26
Last updated: 2023/07/15
In the past, using an AMD GPU for deep learning (DL) required Linux with ROCm installed.
Microsoft has since published DirectML, which lets any GPU that supports DirectX 12 be used for DL on Windows.
How to configure
Take PyTorch-DirectML as an example:
(Optional) Install a MiniConda environment and add these directories to your environment variables:
    Path\To\Miniconda3
    Path\To\Miniconda3\Scripts
    Path\To\Miniconda3\Library\bin
Create a Python env (without Conda, you still need Python 3.8-3.10):
    conda create -n directML python=3.8
Activate the env and install the dependencies:
    conda activate directML
    pip install torch-directml
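Since the steps above call for Python 3.8-3.10, checking the interpreter first can save a failed install. A minimal sketch; the helper name is_supported is made up for illustration, and the version bounds are simply the ones stated above:

```python
import sys

# Supported range taken from the note above (Python 3.8-3.10).
MIN_VERSION = (3, 8)
MAX_VERSION = (3, 10)


def is_supported(version_info):
    """Return True if (major, minor) lies within the supported range."""
    major_minor = tuple(version_info[:2])
    return MIN_VERSION <= major_minor <= MAX_VERSION


if is_supported(sys.version_info):
    print("Python version is in the supported range")
else:
    print("Python %d.%d is outside the 3.8-3.10 range" % sys.version_info[:2])
```

Run it inside the activated env before calling pip.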
Usage
    import torch_directml
    dml = torch_directml.device()
    # then pass device=dml
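To keep one script usable on machines with and without DirectML, the device can be chosen at runtime. The following is only a sketch: pick_device is a hypothetical helper, and falling back to the string "cpu" is my assumption rather than part of the setup above.

```python
import importlib.util


def pick_device():
    """Return a DirectML device when torch_directml is installed, else "cpu"."""
    if importlib.util.find_spec("torch_directml") is not None:
        import torch_directml
        return torch_directml.device()
    return "cpu"  # assumed fallback so the script still runs without DirectML


device = pick_device()
print("Using device:", device)

# Only run the tensor check when torch itself is installed.
if importlib.util.find_spec("torch") is not None:
    import torch
    x = torch.ones(2, 3).to(device)  # move a tensor to the chosen device
    y = (x + x).to("cpu")            # compute there, then copy back
    print(y)
```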
Run an Open-Source Project
Take GFPGAN from Tencent as an example. It aims to restore faces and has Real-ESRGAN built in, which can enhance the background.
Follow its installation guide.
Note: if you want to enhance the background, Real-ESRGAN must also be installed.
After following the guide, download the pre-trained model:
    wget https://github.com/TencentARC/GFPGAN/releases/download/v1.3.0/GFPGANv1.3.pth -P experiments/pretrained_models
You can also use any other tool to download the file into that directory.
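wget is often missing on Windows, so one alternative is a few lines of Python using only the standard library. This is a sketch: target_path and download_model are hypothetical helper names, and the URL and directory are the ones from the wget command above.

```python
from pathlib import Path
from urllib.parse import urlparse
from urllib.request import urlretrieve

MODEL_URL = "https://github.com/TencentARC/GFPGAN/releases/download/v1.3.0/GFPGANv1.3.pth"
MODEL_DIR = "experiments/pretrained_models"


def target_path(url, dest_dir):
    """Build the local file path: dest_dir plus the file name from the URL."""
    return Path(dest_dir) / Path(urlparse(url).path).name


def download_model(url=MODEL_URL, dest_dir=MODEL_DIR):
    """Download url into dest_dir (created if missing) unless already present."""
    dest = target_path(url, dest_dir)
    dest.parent.mkdir(parents=True, exist_ok=True)
    if not dest.exists():
        urlretrieve(url, str(dest))
    return dest


print(target_path(MODEL_URL, MODEL_DIR))
# Call download_model() from the GFPGAN repository root to fetch the file.
```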
How to use:
    Usage: python inference_gfpgan.py -i inputs/whole_imgs -o results -v 1.3 -s 2 [options]...

      -h                   show this help
      -i input             Input image or folder. Default: inputs/whole_imgs
      -o output            Output folder. Default: results
      -v version           GFPGAN model version. Option: 1 | 1.2 | 1.3. Default: 1.3
      -s upscale           The final upsampling scale of the image. Default: 2
      -bg_upsampler        background upsampler. Default: realesrgan
      -bg_tile             Tile size for background sampler, 0 for no tile during testing. Default: 400
      -suffix              Suffix of the restored faces
      -only_center_face    Only restore the center face
      -aligned             Input are aligned faces
      -ext                 Image extension. Options: auto | jpg | png, auto means using the same extension as inputs. Default: auto
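If you want to launch the script from another Python program, the command can be assembled as an argument list. A small sketch that uses only the -i, -o, -v and -s options from the help text above; build_command is a hypothetical helper, and the actual run is left commented out because it must be started from the GFPGAN repository root.

```python
import subprocess
import sys


def build_command(input_path="inputs/whole_imgs", output_dir="results",
                  version="1.3", upscale=2):
    """Assemble the argument list for inference_gfpgan.py."""
    return [
        sys.executable, "inference_gfpgan.py",
        "-i", str(input_path),
        "-o", str(output_dir),
        "-v", str(version),
        "-s", str(upscale),
    ]


cmd = build_command()
print(" ".join(cmd))
# Run from the GFPGAN repository root:
# subprocess.run(cmd, check=True)
```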
Because basicsr and facexlib both depend on torch, you need to uninstall torch and force-reinstall pytorch-directml (the dependency requirements will not be satisfied, but that is OK):
    pip uninstall torch
    pip install --force-reinstall pytorch-directml
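After swapping packages like this, it helps to see what is actually installed in the env. A minimal sketch using the standard library; installed_version is a hypothetical helper, and the package names are just the ones mentioned in this post.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# Print the state of every package touched by the steps in this post.
for name in ("torch", "torch-directml", "pytorch-directml", "basicsr", "facexlib"):
    print(name, "->", installed_version(name))
```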
About Real-ESRGAN
The unoptimized Real-ESRGAN is slow on CPU. If you really want to use it on CPU, modify the corresponding code (inference_gfpgan.py, line 59):
    # ------------------------ set up background upsampler ------------------------
    if args.bg_upsampler == 'realesrgan':
        from basicsr.archs.rrdbnet_arch import RRDBNet
        from realesrgan import RealESRGANer
        model = RRDBNet(num_in_ch=3, num_out_ch=3, num_feat=64, num_block=23, num_grow_ch=32, scale=2)
        bg_upsampler = RealESRGANer(
            scale=2,
            model_path='https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.1/RealESRGAN_x2plus.pth',
            model=model,
            tile=args.bg_tile,
            tile_pad=10,
            pre_pad=0,
            half=False)  # half=True uses fp16; it must be False in CPU mode
    else:
        bg_upsampler = None
Up to this point, everything works.
DirectML Support
In inference_gfpgan.py, add a device parameter to the constructors of RealESRGANer and GFPGANer:
    import torch_directml
    dml = torch_directml.device()
    bg_upsampler = RealESRGANer(..., device=dml)
    restorer = GFPGANer(..., device=dml)
But there are currently two problems:
First, the upsampler triggers an assertion in basicsr\archs\arch_util.py after processing several tiles, and this is still unsolved:
    Assert that hh % scale == 0 and hw % scale == 0
Second, if realesrgan is disabled, face restoration runs without errors, but the result looks very strange. To avoid discomfort, no results are shown here.
The CPU output is normal, but the DirectML output is abnormal.
After analysis, the problem may lie in the bilinear interpolation calculation. I have raised an issue in the DirectML project.
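One way to test this suspicion is to run the same bilinear upsampling on CPU and on the DirectML device and compare the results. This is only a sketch of such a check, not the analysis I actually did: compare_bilinear is a hypothetical helper, the tensor size and scale factor are arbitrary, and it returns None when torch or torch_directml is not installed.

```python
import importlib.util


def have(module_name):
    """Return True if the module can be imported in this environment."""
    return importlib.util.find_spec(module_name) is not None


def compare_bilinear():
    """Max absolute CPU-vs-DirectML difference for bilinear upsampling.

    Returns None when torch or torch_directml is unavailable.
    """
    if not (have("torch") and have("torch_directml")):
        return None
    import torch
    import torch.nn.functional as F
    import torch_directml

    dml = torch_directml.device()
    x = torch.rand(1, 3, 16, 16)  # arbitrary small NCHW test tensor
    ref = F.interpolate(x, scale_factor=2, mode="bilinear", align_corners=False)
    out = F.interpolate(x.to(dml), scale_factor=2, mode="bilinear",
                        align_corners=False).cpu()
    return (ref - out).abs().max().item()


result = compare_bilinear()
print("max abs difference:", result)
```

A difference far above ordinary floating-point noise would point at the interpolation operator rather than at GFPGAN itself.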