Releases: zhengxwen/HIBAG.gpu
v0.99.1
CHANGES IN VERSION 0.99.1 (2023 Nov)
- the R global option variable 'HIBAG_GPU_INIT_TRAIN_PREC' can be set before loading the HIBAG.gpu package via e.g., `options(HIBAG_GPU_INIT_TRAIN_PREC="half")`. It should be NULL (unset), 'auto', 'half', 'mixed', 'single' or 'double'. It can be used without calling `hlaGPU_Init(,train_prec="")` to reset the training precision.
- fix the GPU memory leaks
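
The option described above could be used roughly as follows. This is a minimal sketch, not taken from the package documentation: it assumes HIBAG.gpu is installed and a GPU device is available, and the `requireNamespace()` guard is added here only so the snippet does not fail where the package is missing.

```r
# Set the initial training precision BEFORE the package is loaded.
# Values listed in the release notes: NULL (unset), "auto", "half",
# "mixed", "single" or "double".
options(HIBAG_GPU_INIT_TRAIN_PREC = "half")

# Check that the option is set in this R session.
stopifnot(identical(getOption("HIBAG_GPU_INIT_TRAIN_PREC"), "half"))

# Load the package afterwards so that it can read the option on start-up
# (assumption: HIBAG.gpu is installed and a GPU device is present).
if (requireNamespace("HIBAG.gpu", quietly = TRUE)) {
    library(HIBAG.gpu)
}
```

Setting the option in a site or user `.Rprofile` would apply the same precision to every session without touching analysis scripts.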
CHANGES IN VERSION 0.99.0 (2021 Oct)
- remove the dependency on the OpenCL R package
- reimplement the HIBAG GPU algorithm for speed-up
- new implementation using half and mixed precisions
- a new function `hlaAttrBagging_MultiGPU()` to leverage multiple GPU devices
Pre-release v0.9.2
- add KIR information
- optimize the GPU kernel by avoiding unnecessary single-precision calculation
- support the Windows platform
First pre-release version
Performance
Speed-up factors for training HIBAG models, relative to the running time on a single CPU core:
CPU (1 core) | CPU (1 core, POPCNT) | 1x NVIDIA Tesla K80 | 1x NVIDIA Tesla M40 | 1x NVIDIA Tesla P100 |
---|---|---|---|---|
1 | 1.63 x | 24.3 x | 35.4 x | 121.5 x |
The benchmarks used HIBAG v1.14.0 and HIBAG.gpu v0.9.0.
- CPU (1 core): the default installation from Bioconductor, which supports SIMD SSE2 instructions, on an Intel(R) Xeon(R) CPU E5-2630L @ 2.40GHz
- CPU (1 core, POPCNT): optimized with the Intel/AMD POPCNT instruction, on an Intel(R) Xeon(R) CPU E5-2630L @ 2.40GHz
This work was made possible, in part, through HPC time donated by Microway, Inc. We gratefully acknowledge Microway for providing access to their GPU-accelerated compute cluster (http://www.microway.com/gpu-test-drive/).